Duke University, Department of Mathematics

Program ID: Duke-DATA2018 [#630]
Program Title: Data+ 2018
Program Location: Durham, North Carolina 27708-0320, United States
Subject Areas: Data Science, Interdisciplinary
Application Deadline: 2018/02/24 11:59PM finished (2017/12/08, finished 2018/08/26)
Program Description:    

*** This program has been closed, and no new applications are accepted. ***

Data+ is a full-time, ten-week summer research experience that welcomes Duke undergraduate and master's students interested in exploring new data-driven approaches to interdisciplinary challenges. It is suitable for students from all class years and all majors.

Students join small project teams (at most 3 undergrads and 1 master's student per team), working alongside other teams in a communal environment. They learn how to marshal, analyze, and visualize data, while gaining broad exposure to the modern world of data science. The projects (see below) come from an extremely diverse set of subject areas. We hope that students will be able both to dig deeply into their specific project and to get a broad picture of most of the skills needed for modern data science.

Participants will receive a $5,000 stipend, out of which they must arrange their own housing and travel. Funding and infrastructure support are provided by a wide range of departments, schools, and initiatives from across Duke University, as well as by outside industry and community partners. Participants may not accept employment or take classes during the program; this requirement is strictly enforced and non-negotiable.

The program runs from May 29th until August 3rd, 2018. The application deadline is Feb. 24, 2018, but we will evaluate applications on a rolling basis, so please get your applications in as soon as you can!

You will find the projects planned for summer 2018 in the numbered list below. Click on the project names to learn more. Please indicate the numbers of the projects you choose when you apply; you may list up to three choices in ranked order of preference. If you are seeing this page in December 2017, please note that more projects may be added in the coming weeks; there will eventually be 25 projects listed!

For some projects, human subjects research training may be required and will be provided in advance. With each project, we have attempted to list potential majors and/or interests that might be best suited for the project, but these should not be seen as requirements in any way! Quantitative STEM majors like mathematics, computer science, statistics, and electrical engineering are relevant to all.


1) Poverty in Writing and Images What do we mean by the term “poverty”? A team of students under the direction of Professor Astrid Giugni will analyze how the way we talk about poverty and public policy has changed over time. The team will work with two databases containing visual, textual, and audio documents from 1473 to the present, allowing students to track and analyze how our understanding of poverty has changed over time. The group will tackle the challenge of analyzing the political and popular language and imagery of poverty in order to create a visualization that contextualizes how financial and welfare policy is influenced by how we talk about poverty. English, Literature, History, Public Policy, Political Science, all Quantitative STEM.

2) Pirating Texts A team of students led by UNC-CH graduate student Grant Glass and Duke English professor Charlotte Sussman will track the thousands of editions of Daniel Defoe’s Robinson Crusoe, including the plethora of movies and “Robinsonades,” most of which deviate from Defoe’s original work. By examining the differences in these stories through word-vector models and categorization algorithms, we can trace how the deviations often reflect the place and time of their production and consumption, evoking a range of questions that further our understanding of how the expansion and collapse of the British Empire is wrapped up in notions of capitalism, race, empire, gender, and climate concerns. English, Literature, History, Geography, Visual & Media Studies.

3) Improving the Machine Learning Pipeline at Duke A team of students will contribute to an effort to operationalize the application of distributed computing methodologies in the analysis of electronic medical records (EMR) at Duke. Students will then use these systems to execute natural language processing (NLP) on clinical narratives and radiology notes with existing, ongoing analyses of Duke data. This Data+ team will work with the Duke Forge, an interdepartmental collaboration focused on data science research and innovation in health and biomedical sciences. PreHealth/PreMed, BME, Economics, Biostatistics, all quantitative STEM.

4) Mental Health Intervention by the Durham Police A team of students led by Dr. Nicole Schramm-Sapyta of the Duke Institute for Brain Sciences will provide analytical consulting support to the Durham Crisis Intervention Team (CIT) Collaborative, a county-wide effort to provide law enforcement and first responders with specialized training in mental illness and crisis intervention techniques. The team will build on last summer’s descriptive analysis of 9-1-1 call data by incorporating data from partner agencies to assess whether CIT training reduces recidivism, increases utilization of mental health services, and generally improves the lives of Durham citizens with mental illness. All Social Sciences, All Quantitative STEM.

5) Social Determinants of Health A team of students led by faculty and researchers at the Social Science Research Institute will bring together data that will facilitate research using social determinants of health (SDH) to examine, understand, and ameliorate health disparities. This project will identify SDH variables that have the potential to be linked to data from the MURDOCK Study, a longitudinal health study based in Cabarrus County, NC. Sociology, Public Policy, PreHealth/PreMed, Global Health, Environmental Science, all quantitative STEM.

6) Women's Spaces How are women influenced by the spaces that they are allowed to occupy? A group of students, led by English Professor Charlotte Sussman, will examine how the spaces and places women can inhabit have changed over time, and how such changes have affected women’s rights and opportunities. The team will analyze the visual representations of women depicted in magazines from the nineteenth to the twenty-first century through the Women’s Magazine Archive, considering how images about women influence the reality that women can both imagine and live. English, Literature, History, Visual & Media Studies, Gender, Sexuality, & Feminist Studies, all quantitative STEM.

7) Rare Metabolic Diseases A team of students led by Rachel Richesson (Duke University School of Nursing) will explore patterns of health care treatment and utilization for several rare metabolic disorders treated at Duke University Health System. Students will gain an understanding of medical data, the use of reference terminologies to generate new relationships and inferences, and various data analysis and visualization techniques to describe and compare the clinical profiles of patients with different conditions. PreHealth/PreMed, Biology, BME, Biostatistics, all quantitative STEM.

8) Deep Learning for Single Cell Analysis A team of students led by a computational biologist and a cell biologist will develop methods to identify cell subsets and their developmental, maturation and activation lineage relationships using deep learning approaches. Students will learn to process single cell RNA sequencing data and use the Python programming language and TensorFlow to characterize lung stem cells involved in wound healing. This work will help Duke researchers establish a deep learning pipeline for single cell analysis with applications in immunology, cell biology and cancer. Biology, Biomedical Engineering, PreHealth/PreMed, Biostatistics, all Quantitative STEM.

9) Visualizing the Lives of Orphaned and Separated Children A team of students led by researchers in the Center for Health Policy and Inequalities Research will develop a platform that visualizes significant life events across time for more than 3,000 orphaned and separated children in Cambodia, Ethiopia, India, Kenya, and Tanzania from the Positive Outcomes for Orphans (POFO) study. Anthropology, Sociology, History, Public Policy, Education, Global Health, PreMed/PreHealth, all Quantitative STEM.

10) Energy Infrastructure Map of the World A team of students led by researchers in the Energy Data Analytics Lab and the Sustainable Energy Transitions Initiative will develop machine learning techniques for automatically mapping global electricity infrastructure using satellite imagery. By identifying substations, transmission lines, and distribution lines, students will create and publish a training dataset that we will use to automate grid infrastructure geolocation. These data and techniques will empower researchers and policymakers to better understand who has grid-connected access to electricity, who is underserved, and how to most efficiently transition communities and countries towards sustainable electrification. Environmental Science, Economics, all quantitative STEM.

11) How do we build and grow a PTA? A team of students led by Glenn Elementary School Parent Teacher Association (PTA) President, David Vanie, will explore publicly available data in order to develop a set of metrics for understanding the needs of the GSE parent community in a holistic way. The data will identify potential barriers to parent involvement, and will inform best practices for increasing participation throughout the 2018-2019 school year at GSE. The work will be used to provide helpful insight for engaging parents in PTA organizations at public schools throughout Durham, and across the country. Public Policy, Sociology, Anthropology, History, Geography, Education, Political Science, Economics.

12) Data and Technology for Fact-Checking Today, our society is struggling with an unprecedented amount of misinformation and disinformation. A team of students led by researchers in the Duke Reporters’ Lab and Department of Computer Science will build databases, systems, and apps to help fact-checkers combat falsehoods and hyperboles, and disseminate their fact-checks to the public. The team will apply database, machine learning, algorithmic, and app development techniques to scout media and public interest for check-worthy claims, and alert media consumers to previously checked claims instantly. Political Science, Journalism, Public Policy, Anthropology, all quantitative STEM.

13) Smartphones and the Sixth Vital Sign A team of students led by Janet Bettger and an interdisciplinary team with the 6th Vital Sign Study will use Census and other public data to examine the representativeness of people who participated in this smartphone-based population health study. Students will design an online interactive map and other web-based tools that can be easily updated with new study participants, illustrating key relationships such as health status with rurality, medical service availability, and sociodemographics. PreMed/PreHealth, Geography, BME, all quantitative STEM.

14) Big Data for Reproductive Health A team of students led by clinical and non-clinical global reproductive health researchers at the Duke Global Health Institute will develop an interactive, web-based platform that curates raw data on contraceptive discontinuation from the Demographic and Health Surveys (DHS) into a tool to help researchers and family planning advocates develop fresh insights around contraceptive discontinuation. Students will develop and refine the prototype, debut it with experts in online data visualization platforms at RTI, and prepare a dissemination plan for the tool. Students will have an opportunity to pilot creative ways to incorporate social media data into the tool and ways to validate this data against ground-truth data from population-representative surveys. Global Health, PreHealth/PreMed, Gender, Sexuality and Feminist Studies, Public Policy, all Quantitative STEM.

15) Vaccine Hesitancy and Uptake Despite overwhelming scientific evidence on the benefits of vaccinations, pregnant women and parents of young children often refuse to accept vaccinations for themselves or their children. As part of a larger study to understand vaccine hesitancy locally, a group of students will conduct secondary data analysis of the coverage and timeliness of maternal and pediatric vaccines in Durham, and identify determinants of timely vaccination uptake. Results may inform the development of interventions to reduce hesitancy and improve the coverage and timeliness of maternal and pediatric vaccine uptake in Durham. Global Health, Journalism, Public Policy, PreMed/PreHealth, Biostatistics, Visual & Media Studies, all quantitative STEM.

16) Gerrymandering and the Extent of Democracy in America A team of students led by Professors Jonathan Mattingly and Gregory Herschlag will investigate gerrymandering in political districting plans. Students will improve on and employ an algorithm to sample the space of compliant redistricting plans for both state and federal districts. This work will continue the Quantifying Gerrymandering project, seeking to understand the space of redistricting plans and to find justiciable methods to detect gerrymandering. Political Science, Public Policy, Sociology, Geography, all quantitative STEM.

17) Complex Decisions, Real Numbers Would you like to know what influences patients’ medical decisions when outcomes are uncertain? Using a big data approach, we will explore a large number of physician-patient conversations and disentangle the complex decision-making process. Students will be introduced not only to data science but also to behavioral research and aspects of communication in healthcare. This work will inform physicians on how to reduce overutilization of unnecessary interventions and ensure the well-being of patients. Economics, PreHealth/PreMed, Biology, Biostatistics, all quantitative STEM.

18) Analytical Exploration for Duke Development A team of students will work with Duke's Office of Development & Alumni Affairs to understand how cutting-edge data analytic techniques, such as sentiment analysis and network analysis, can be used to understand a variety of giving behaviors and trajectories. Students will work with de-identified data in a secure computing environment, and will have a rich opportunity for creative exploration in consultation with Development professionals. Anthropology, Economics, Sociology, Psychology, All quantitative STEM.

19) Data-driven Improvement of Datacenter Performance A team of students, under the direction of Prof. Benjamin C. Lee, will explore how a variety of statistical machine-learning techniques may be able to improve datacenter performance. The team will have frequent opportunities to interact with analytics leadership at Lenovo. Economics, All quantitative STEM.

20) Duke Wireless Data A team of students in conjunction with Duke’s Office of Information Technology will make use of Duke’s wireless network data to build detailed maps of wireless coverage, strength and utilization across campus. Students will work directly with the network data and have access to the analytics tools used in OIT, and will have a great opportunity for exploration of the data in consultation with OIT network, security and data analytics professionals. All quantitative STEM.

21) Co-curricular Technology Pathways A team of students will work with Duke’s Office of Information Technology to conceptualize and potentially develop an “e-advisor” program that will help students navigate, augment, and map their way through Duke’s co-curricular ecosystem. The team will identify available data, programs, and resources; define learning objectives; recommend common pathways; create a storyboard of the program that builds out a “master narrative” experience; and prototype the branching and decision engine. Education, Social Sciences, all quantitative STEM.

22) Data and the Global Corporate Bond Market A team of students will work with Professor Emma Rasiel to understand whether an analysis of credit spreads on bonds issued by international firms in multiple countries over time can shed light on potential arbitrage opportunities. The team will have frequent opportunities to interact with analytics professionals at a leading financial advisory and asset management firm. Economics, all quantitative STEM.

23) Construction Machinery and the Business Cycle A team of students will consult with a leading financial advisory and asset management firm that is seeking to understand how big data can shed light on the secondary market for construction machinery. Students will explore a combination of publicly-available datasets that describe the used-machinery market and its potential implications as an indicator for the business cycle. There will be frequent interactions with analytical professionals from the firm. Economics, Civil Engineering, all quantitative STEM.

24) Maximizing Data Communication for Faster Energy Access Power for All’s Platform for Energy Access Knowledge (PEAK) is an interactive knowledge platform designed to automatically curate, organize, and streamline large, growing bodies of data into digestible, sharable, and useable knowledge through automated data capture, indexing, and visualization. A team of students will consult with Power for All to creatively visualize PEAK’s library, and to explore machine learning and natural language processing tools that can enable auto-extraction and visualization of data for more effective science communication. Environmental Science, Ecology, Economics, all Quantitative STEM.

Application Materials Required:
Submit the following items online at this website to complete your application:
And anything else requested in the program description.

Further Info:
Mathematics Department
Duke University, Box 90320
Durham, NC 27708-0320

© 2020 MathPrograms.Org, American Mathematical Society. All Rights Reserved.