Data@UCI Embarks on Data Science Adventures in Inaugural Datathon

May 4, 2023

Data@UCI hosted its first-ever datathon in UCI’s Interdisciplinary Science and Engineering Building on April 15-16. With a theme of “Embark,” the datathon aimed to inspire “new adventures and continued growth” in the field of data science for both Data@UCI and the wider community of data enthusiasts at UCI.

Colin Yee, a second-year data science major and vice president of community development at Data@UCI, says planning the club’s inaugural datathon would not have been possible without the dedication and marketing efforts of their board members and interns. 

“Since this is our first datathon, we had to essentially start from scratch,” says Yee. “One of the biggest challenges was securing data challenges and funding for our event. We’ve sent hundreds of emails and LinkedIn messages to different companies and organizations with a dozen at most responding back … we were very glad of the sponsors and funding we were able to receive.”

According to Data@UCI president and third-year data science major William Hou, the club’s hard work paid off — the datathon exceeded Hou’s expectations before it officially began.

“Our team has never put on an event like this before so there was a lot of uncertainty, especially when we first started planning, around whether people would actually sign up,” says Hou. “We ended up not only meeting our attendance goal but actually having to waitlist some of our applicants due to capacity limitations.”

Hou also expresses excitement about successfully putting a data-related spin on traditional software hackathons, adding that Data@UCI plans to host a “bigger and better” datathon next year. Data@UCI hopes to build a strong reputation for its collegiate datathon in the years to come. In the meantime, Hou and Yee encourage students to continue participating in datathons.

“You have the opportunity to work in a team and analyze data from different companies and organizations,” says Yee. “A datathon like this can help motivate students to learn different data-related applications and to implement the skills they gained from UCI outside of the classroom!”

Here are the winning projects from the Embark datathon:

Best Overall & People’s Choice – Mischief in Montreal
The Mischief in Montreal team applied data analytics and machine learning to a dataset of over 250,000 crime reports to examine the crimes, political landscape, and effectiveness of existing crime-related policies in Montreal. Based on their findings, they proposed a few policies to address crime in the city.

Mischief in Montreal was created by:

  • Maithy Le – second-year computer science major, UCI
  • Audrey Nguyen – second-year computer science major, UCI
  • Sandra Nguyen – second-year computer science major, California State University, Fullerton
  • Jingqi Yao – fourth-year computer science & mathematics double major, UCI

Runner Up – Montreal Hockey Shoots Into Crime
As the project name implies, the students behind Montreal Hockey Shoots Into Crime analyzed crime data in Montreal and compared it to the win/loss outcomes of the Montreal Canadiens hockey team. They found that winning is correlated with an increase in crime and proposed policy solutions to decrease Montreal’s crime rates.

Montreal Hockey Shoots Into Crime was created by: 

Board’s Pick – Goofy Ahh Scientists: Montreal Crime Analysis
The Montreal Crime Analysis team conducted an analysis of crime data in Montreal and noted key findings such as the top crimes in the city and precincts with the most police presence. They also created an interactive dashboard that displays crime data for each precinct.

Montreal Crime Analysis was created by:

  • Sean Fong – second-year computer science major, UCI
  • John Lorenzini – second-year computer science major, UCI
  • Neel Ramesh – second-year computer science & engineering major, UCI
  • Isaac Rico – second-year computer engineering major, UCI

Best Visualization – Montreal Crime Space-Time Analysis
Through exploratory analysis using maps and facet plots, the students behind Montreal Crime Space-Time Analysis attempted to identify patterns or trends in Montreal’s crimes based on location, time of day, season and other factors. They also explored whether machine learning models could be used to predict crime.

Montreal Crime Space-Time Analysis was created by:

  • Randy Huynh – third-year computer science major with a statistics minor, UCI
  • Kevin Wu – third-year computer science major with a statistics minor, UCI
  • Hao Li – fourth-year math major with a statistics minor, UCI
  • Yiqin Chen – fourth-year business information management and data science double major 

Best Presentation – Beach Consulting Montreal Crime Analysis
Taking on the role of data consultants, the Beach Data Consulting team set out to use data-driven insights to improve citizen safety. They cleaned and enriched Montreal’s crime data using Python and geocoding APIs, and then analyzed trends over time, the most common crime types and the areas with the most crime to create their policy recommendations.

Beach Consulting Montreal Crime Analysis was created by:

  • Jason Vo – third-year finance major with a computer science minor, California State University, Long Beach
  • Victor Guan – fourth-year data science major, UCI
  • Chandler Sidars – fourth-year management information systems major with a cybersecurity applications minor, California State University, Long Beach
  • Peter John Villasista – fourth-year economics major with a finance minor, California State University, Long Beach

Social Impact Award – Montrealidays
The Montrealidays team focused on analyzing Montreal’s crime data to rebuild a sense of safety and community in the city after the peak of the COVID-19 pandemic. They proposed that the city invest in recreation, culture and community events to promote positive social interactions as a means of potentially reducing crime rates.

Montrealidays was created by:

  • Remi Inoue – third-year mathematics major with economics & statistics minors, UCI
  • Olivia Lin – third-year mathematics major with French & statistics minors, UCI
  • Minh Nguyen – third-year computer science major with informatics & psychology minors, UCI

Best Use of MATLAB (MathWorks) – Does Race and Ethnicity Correlate with a Sepsis Diagnosis?
The students behind Does Race and Ethnicity Correlate with a Sepsis Diagnosis? aimed to visualize data on sepsis and draw conclusions using variables such as gender, race, previous occurrence of sepsis and more. Although they ran into challenges with data modeling, they learned about the importance of comprehensive and unbiased data collection methods.

Does Race and Ethnicity Correlate with a Sepsis Diagnosis? was created by:

  • Vincent Carluccio – third-year data science & Japanese language and literature double major, UCI
  • Kenny Chen – third-year data science major with a health informatics minor, UCI
  • August Vu – third-year civil engineering major, UCI
  • Emily Wang – second-year mathematics major, UCI

Best Melissa Data Project (Melissa) – Conquering the Impossible!
Conquering the Impossible! is an address lookup tool that functions similarly to a search engine, generating the most likely address corresponding to a given input. Moving forward, the creators hope to refine the cleaning process further to improve the accuracy of results. 

Conquering the Impossible! was created by:

Best StrataScratch Project (StrataScratch) – Safety First: An Analysis on Montreal’s Crime Data
The Safety First project graphs Montreal’s crime data to identify crime hotspots in the city and the types of crimes in each location. Users can also enter an address to obtain a ranking of how likely each type of crime is to recur in that area.

Safety First was created by:

  • Luc Nguyen – first-year computer science major, Orange Coast College
  • Azra Zahin – first-year computer science major, UCI
  • Ngoc Huynh – second-year software engineering major, UCI
  • Sofia Perez de Tudela – second-year software engineering major, UCI

Best UCI ODIT Project (UCI ODIT) – Sepsis: A Case Study
Recognizing the severity of sepsis, the Sepsis: A Case Study team sought to create a project that raises awareness about this medical issue. To do so, they created a machine learning model that predicts the likelihood of contracting sepsis based on factors such as age, race and gender. 

Sepsis: A Case Study was created by:

  • Kyle Huynh – first-year data science major, UCI
  • Veronic Trinh – first-year undeclared major, UCI
  • Ryan James (RJ) Calabio – third-year quantitative economics major, UCI
  • Crystal Popeney – third-year cognitive science major with an ICS minor, UCI

View the rest of the projects on the EMBARK Datathon 2023 Devpost.

— Karen Phan

Photos courtesy of Data@UCI