Data Science Directed Research Spring 2019
Undergraduate students from the Data Science program at Florida Polytechnic University conducted research and produced significant written documentation of an experiment, research exploration, or special interest project in data science. Some of the projects are described on this page.
Superb job by @Khanzi_w from @FLPolyU for his final project. Text mining on the popularity of the 2020 Presidential election candidates using Twitter data. Check his app here: https://t.co/O6LeTRTmIT #rstats #rtweet #DataScience #shiny @drob @juliasilge
— Rei Sanchez-Arias (@reisanar) April 9, 2019
Project Descriptions
Twitter 2020 Political Sentiment by Kahlil Wehmeyer
This app was created to better understand the sentiment towards and surrounding the presidential candidates in the 2020 election for the United States of America.
This app is, in a way, tracking candidate popularity. Favorite and retweet counts are a somewhat reasonable measure of how the public feels about a candidate's message. References include: Text Mining with R, Shiny Apps, R for Data Science, and Twitter Developer. Some of the key packages for this project were: Tidyverse, Plotly, Tidytext, Topic Models, igraph, and ggraph.
Click here to learn more about Kahlil
“Graduating undergraduate student at Florida Polytechnic University in Lakeland, Florida. Obtaining a bachelor’s degree in Data Analytics with a concentration in Big Data Analytics. Currently working as a Data Analytics intern at Yes.Fit. President and founder of the Data Analytics and Data Science club at Florida Poly. Looking forward to graduating in the coming year and starting a professional career in data science.” - Spring 2019
Relocation Station by Natalie Brum and Erich Mengore
This project aimed to help a new Data Science graduate find the best city to relocate to, based on economic factors like cost of living and average salary as well as natural factors like the climate and the natural disasters in an area.
Data Sources: Cost Of Living 2018 from Numbeo.com, Regional price parity and Per Capita Personal Income in 2016 compiled by the Bureau of Economic Analysis (BEA.gov), Natural Disaster Map, The Ready Store, Natural Disaster Locations
R packages such as Geonames and GSODR were used in this project. Results will be showcased in the form of a Shiny App.
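One way to combine factors like these into a single ranking is a weighted score over rescaled values. The sketch below is only an illustration of that idea, written in Python rather than with the R packages used in the project. The cities, figures, weights, and the helper `min_max` are all made up for the example and are not taken from the project's data or method.

```python
def min_max(values):
    """Rescale a list of numbers to the 0-1 range."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

# Hypothetical cities with made-up figures (not project data)
cities = ["City A", "City B", "City C"]
avg_salary = [95000, 80000, 70000]   # higher is better
cost_index = [90, 70, 60]            # higher is worse
disasters_per_decade = [2, 6, 1]     # higher is worse

# Assumed weights; a real user would choose these to fit their priorities
weights = {"salary": 0.4, "cost": 0.4, "risk": 0.2}

salary_score = min_max(avg_salary)
# Invert the "higher is worse" factors so that 1.0 is always the best value
cost_score = [1 - v for v in min_max(cost_index)]
risk_score = [1 - v for v in min_max(disasters_per_decade)]

scores = {
    city: weights["salary"] * s + weights["cost"] * c + weights["risk"] * r
    for city, s, c, r in zip(cities, salary_score, cost_score, risk_score)
}

# Cities ordered from highest to lowest combined score
ranking = sorted(scores, key=scores.get, reverse=True)
for city in ranking:
    print(city, round(scores[city], 3))
```

Changing the weights changes the ranking, which is the kind of trade-off an interactive app could let a user explore.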
Click here to learn more about Natalie
“Natalie Brum, a senior at Florida Polytechnic University, is studying Data Analytics. Natalie’s involvement outside of the classroom allowed her to attain a Manufacturing Systems internship at 3M during her junior year of college. As an intern, she was able to analyze the data of machines and use Neural Networks to predict future behavioral changes within these machines. She looks forward to trying new things as a Project Engineering Intern at 3M this upcoming summer. Currently, Natalie is the President of the Society of Women Engineers (SWE). She is also a Teaching Assistant for First Year Experience, and Concepts and Methods.” - Spring 2019
Click here to learn more about Erich
“I am currently a senior at Florida Polytechnic University studying for a Bachelor of Science in Data Analytics. My experience in the field includes working as a Systems Analyst intern at Nielsen in Tampa, where I learned Database Design & Development and GoLang Backend development. My immediate goal after graduation is to gain more experience within the field focusing on Big Data and possibly Artificial Intelligence. My long-term goal in Data Analytics is to help tackle complex modern problems like Climate Change in order to help make for a better future. My current strongest skills include R, Programming Logic, and Unix/Linux (other skills listed within my LinkedIn). All in all, I am looking forward to the future and willing to tackle anything that comes my way.” - Spring 2019
Mining Sleep Studies by Luke Rhon, Marc Burstein, and Kenneth Williams
The main goal of this project is to build a predictive model of student sleep behavior or performance from a set of input variables. Building such a model lets us test the findings of the 2012 study. So far, our work has focused on identifying variables for our linear regression models. Although we established the main variables for each regression model, the models still do not explain most of the variance. The next step is to try decision trees to see whether they increase predictive power. Models have been built, but additional investigation and fine-tuning are required to improve their accuracy.
Key tools for the project include: tidyverse, scales, rpart
Data sources include: data from a study of sleep patterns for college students (original study: Onyper, S., Thacher, P., Gilbert, J., Gradess, S., “Class Start Times, Sleep, and Academic Performance in College: A Path Analysis”); surveys from the National Sleep Foundation; and the Student Life Dataset from Dartmouth
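As a rough illustration of why a decision tree might add predictive power where a linear regression falls short, the sketch below compares the share of variance explained (R²) by a one-predictor linear fit and by a single-split regression tree. It is written in Python for brevity rather than with the R packages listed above. The data are made up, and the names `sleep_hours`, `gpa`, `fit_line`, and `fit_stump` are hypothetical; this is not the team's model, data, or result.

```python
from statistics import mean

def r_squared(actual, predicted):
    """Share of the variance in `actual` that `predicted` explains."""
    avg = mean(actual)
    ss_total = sum((y - avg) ** 2 for y in actual)
    ss_resid = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_resid / ss_total

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_avg, y_avg = mean(xs), mean(ys)
    slope = (sum((x - x_avg) * (y - y_avg) for x, y in zip(xs, ys))
             / sum((x - x_avg) ** 2 for x in xs))
    return y_avg - slope * x_avg, slope

def fit_stump(xs, ys):
    """Depth-one regression tree: the single split on x with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        sse = (sum((y - mean(left)) ** 2 for y in left)
               + sum((y - mean(right)) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, mean(left), mean(right))
    _, threshold, left_avg, right_avg = best
    return threshold, left_avg, right_avg

# Made-up data: the outcome jumps at a threshold instead of rising steadily
sleep_hours = [4, 5, 5, 6, 6, 7, 7, 8, 8, 9]
gpa = [2.4, 2.5, 2.3, 2.6, 2.4, 3.4, 3.5, 3.3, 3.6, 3.4]

intercept, slope = fit_line(sleep_hours, gpa)
line_r2 = r_squared(gpa, [intercept + slope * x for x in sleep_hours])

threshold, low, high = fit_stump(sleep_hours, gpa)
stump_r2 = r_squared(gpa, [low if x < threshold else high for x in sleep_hours])

print("linear R^2:", round(line_r2, 3))
print("one-split tree R^2:", round(stump_r2, 3), "split at", threshold)
```

On data shaped like this, the single split explains more variance than the straight line. Whether the same holds for real sleep data is an empirical question that the sketch does not answer.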
Click here to learn more about Luke
“I’ve been at Florida Polytechnic University for 4 years and hold senior status. I enrolled wanting to study Computer Science with a concentration in Cyber Gaming but later switched to Data Science with a concentration in Big Data Analytics. My expected graduation date is December 2019. I am expecting to have a summer internship this year. I also currently work at the FPU Barnes & Noble Education Bookstore on campus as a bookseller. I’ve worked there for 3 years and learned about business operations and extensive customer support. My industry interests are Healthcare, Retail, and Marketing. These stem from my time working at the bookstore and my father working in the healthcare area. When I’m not studying at university I like to spend time with family and friends by watching movies, exploring the city, eating out, and attending local events.” - Spring 2019
Click here to learn more about Kenneth
“Currently a student at Florida Polytechnic University. Has worked at several food service jobs and completed classes such as Statistics, Advanced Quantitative Methods, and Data Mining. I am proficient in the use of R, Stata, and Bash, and have also used the C, C++, and Python programming languages. I am a person who will work on a project as long as it takes to be as good as possible, but I am also not afraid to ask for help when needed. I work well with others and can work in a group to help achieve a goal. I have a Microsoft Office 2010 certification and have been able to make a couple of websites under the name of Ralph Smith.” - Spring 2019
Click here to learn more about Marc
“I am a senior at Florida Polytechnic University pursuing a degree in big data analytics. Please email me for more information.” - Spring 2019
Forecasting Florida Incarceration Trends by Jordan Douglass and Oliver Bennett
The initial goal of the project was to measure and reflect upon healthcare in the Florida prison system. Unfortunately, Florida does not require much data reporting on healthcare for inmates. We instead began correlating incarceration data, recidivism rates, and social determinants of health for the years 2012 to 2017 across all counties in the state of Florida. Some of the social determinants included are graduation rates, median household income, race, and children in poverty. In our data exploration, we found many correlations within counties for these determinants. We have been creating forecasting models using linear regression to estimate what the next five years may look like for these counties given this data, and how rates may be affected if the data changed. For example, one question we are asking is: will Polk County have lower incarceration rates if the graduation rate goes up? Once the regression models are complete, we hope to create a web application that is more insightful and interactive for users.
Tools used for this project include Tableau and Microsoft Excel.
Main References: Health Care Contracts Under Scrutiny Article, Privatizing Prisons Article, Overburdened Prison Systems in Florida Study
Other data sources include: List of Florida Institutions, Costs/Quality of Prison Care, Recidivism Report 2018, General Health Outcomes in Florida Downloads, General Health Outcomes and Social Determinants Overview, and Urbanicity Trends and Incarceration Data
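The forecasting idea described above, a linear regression extended five years past the observed data, can be sketched as follows. This is a minimal illustration in Python, not the team's Tableau or Excel workflow. The yearly rates are invented placeholder numbers for a single hypothetical county, and `fit_trend` is a hypothetical helper name.

```python
from statistics import mean

def fit_trend(years, rates):
    """Least-squares line through (year, rate) points; returns (intercept, slope)."""
    x_avg, y_avg = mean(years), mean(rates)
    slope = (sum((x - x_avg) * (y - y_avg) for x, y in zip(years, rates))
             / sum((x - x_avg) ** 2 for x in years))
    return y_avg - slope * x_avg, slope

# Invented incarceration rates per 100,000 residents for one hypothetical county
years = [2012, 2013, 2014, 2015, 2016, 2017]
rates = [410, 402, 396, 391, 383, 378]

intercept, slope = fit_trend(years, rates)

# Extend the fitted line over the following five years
forecast = {year: intercept + slope * year for year in range(2018, 2023)}

print("change per year:", round(slope, 2))
for year, rate in forecast.items():
    print(year, round(rate, 1))
```

This only extrapolates a time trend. A what-if question, such as the effect of a higher graduation rate, would need that determinant as a predictor in the regression, and a straight-line extrapolation assumes the past trend continues unchanged.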
Click here to learn more about Jordan
“Jordan is a senior at Florida Polytechnic University majoring in Data Science with a concentration in Health Informatics. Experience-wise, she has had an internship with Draken International doing web application development for their specific needs and has been a part of Synapcare, a biomedical startup, since May of 2018. Her goal after graduation is to join the United States Air Force as an officer and work in analytics.” - Spring 2019
Click here to learn more about Oliver
“Oliver Bennett is a student at Florida Polytechnic University studying Data Science with a concentration in Health Informatics. He worked with Lakeland Regional Health as an intern in the data analytics department, where he created a request form aligned with a new business model, extracted actionable information from data, and created dashboards for visualization. His experience includes Tableau, SQL, and Python, with working knowledge of numerous other data analysis software packages, and he has interests in machine learning for use in the financial markets.” - Spring 2019
Undergraduate Completion Rates Analysis by Jarrod Merriman
The purpose of this project is to build a model to predict the completion rate of undergraduate students who complete their degree in four years or less. The data being used is the CollegeScoreCard dataset from the U.S. Department of Education. The predictive model will be built using the tool WEKA.
Click here to learn more about Jarrod
“Jarrod Merriman is an undergraduate student at Florida Polytechnic University completing a Bachelor of Science in Data Analytics who is expecting to complete his degree by Spring of 2019. He started as an independent contractor performing general I.T. support but is now working as an intern at Lakeland Regional Health in the Data Analytics Department. He is currently working on a project to build diagrams in Tableau using data he has to manipulate using SQL.” - Spring 2019
Making Money With Houses by Geoffrey Ruiz
This project helps people decide where to purchase a property that they can then rent out for a profit. This could help with retirement or provide some passive income on the side as an investment. To gain insight into the housing market, I used data from Zillow. I then linked the Zillow data to data from the Bureau of Labor Statistics using the BLSApi package in R. I transformed the data from both sources using STATA because that is the program I am most comfortable with. I also used STATA for the statistical analysis to determine which cities were in the better position for each market situation. For example, if you are in the market for a 2 bedroom, the analysis indicates which city is the most ideal.
Main tools and sources: Zillow Data, US Bureau of Labor Statistics API in R, STATA
Click here to learn more about Geoffrey
“Geoffrey is a Big Data Analysis major at Florida Polytechnic University, with a concentration in Business Intelligence. He has worked on multiple projects, such as making a mobile application that helps identify and give up-to-date regulations on any fish that the user has caught, doing advanced statistics for a nonprofit company, helping the Polk County Tax Collector become more efficient in their wait times, and forecasting the private sector employment growth of Ocala, FL. Geoffrey also helped GTE become more helpful to its members by predicting the next best product that a member is most likely to enroll in. All of these projects drew on his expertise in Data Analysis and his love for research. Geoffrey is a true Florida native who grew up fishing and always staying close to the water. He even bought his first boat before he had a truck to pull it with. All of his best memories have been on the water fishing with his friends and family. Other than fishing, Geoffrey loves to be in the gym, and he has placed 1st for his age and weight class in multiple powerlifting meets.” - Spring 2019