Summer 2019 Data+ program features six dynamic energy projects

Posted On:

Tuesday, Jan 15, 2019 - 12:00 am

Data+ is a 10-week summer research experience that welcomes Duke undergraduates interested in exploring new data-driven approaches to interdisciplinary challenges. Students join small project teams, collaborating with other teams in a communal environment. They learn how to marshal, analyze, and visualize data, while gaining broad exposure to the modern world of data science.

For Summer 2019's program, students can choose from an unprecedented number of energy projects. From oil and gas production to smart meters, wholesale energy markets, energy access, and Duke's own energy use, these projects tackle a wide range of real-life energy problems. Visit the Data+ Fair on Thursday, January 17th to talk with project leads and learn more, and apply by the Feb. 25th deadline! The six energy projects are described below:


Investigating oil and gas production in the United Kingdom

Producing oil and gas in the North Sea, off the coast of the United Kingdom, requires a lease to extract resources from beneath the ocean floor, and companies bid for those rights. This team will consult with professionals at ExxonMobil to understand why these leases are acquired and who benefits. This requires historical bid data to investigate what leads to an increase in the number of (a) leases acquired and (b) companies participating in auctions. The goal of this team is to create a well-structured dataset based on company bid history from the U.K. Oil and Gas Authority. These data will come from many different file structures and formats (tabular, PDF, etc.). The team will curate these data to create a single, tabular database of U.K. bid history and work programs.


On the Shelf: exploring oil and gas production in the Gulf of Mexico

Producing oil and gas in the Gulf of Mexico requires rights to extract these resources from beneath the ocean floor, and companies bid into the market for those rights. The top bids are sometimes significantly larger than the next highest bids, but it’s not always clear why this differential exists, and some companies seemingly overbid by large margins. This team will consult with professionals at ExxonMobil to curate and analyze historical bid data from the Bureau of Ocean Energy Management that contains information on company bid history, infrastructure, wells, and seismic survey data, as well as data from the companies themselves and on geopolitical events. The stretch goal of the team will be to see if they can uncover the rationale behind historic bidding patterns. What do the highest bidders know that other bidders do not (if anything)? What characteristics might incentivize overbidding to minimize the risk of losing the right to produce (i.e., ambiguity aversion)?


Smart Meters and Real-time Electricity Consumption Monitoring Algorithms to Reduce Electricity Theft in Developing Countries

A team of students led by researchers in the Energy Access Project will develop ways to evaluate non-technical electricity losses (theft) in developing countries by applying machine learning techniques to smart meter electricity consumption data. Students will use data from smart meters installed at transformers and households through a randomized control trial. Students will develop algorithms that can be used to detect anomalies in the electricity consumption data and create a dataset of such indicators. This project will provide researchers with new ways of incorporating electricity consumption data and applications for electricity utilities in developing-country settings.


A wider lens on energy: adapting deep learning techniques to inform energy access decisions

This team will explore how to develop machine learning techniques that identify energy infrastructure in satellite imagery and that can be trained once and applied almost anywhere in the world. Led by researchers from the Energy Data Analytics Lab and the Sustainable Energy Transitions Initiative, the team will design two datasets: the first containing satellite imagery from diverse geographies with all energy infrastructure labeled, and the second a synthetic version of the same imagery. These data will enable research into whether synthetic imagery may be used to adapt algorithms to new domains. The better these techniques adapt to new geographies, the more information can be provided to researchers and policymakers to design sustainable energy systems and understand the impact of electrification on the welfare of communities.

Faculty Lead: Kyle Bradbury

Project Manager: TBD


Duke Building Energy Use Report

Duke must reduce its energy footprint as it strives for carbon neutrality by 2024. To help this cause, a team of students will review troves of utility usage data and attempt to build an attractive and practical monthly energy use report for every building and school at Duke. This report will not only show historical usage but also provide an energy benchmark for comparison and conservation tips that local administrators can act on. Duke Energy has been using a similar report to encourage conservation at the residential level for years. It is time to bring energy use transparency to the broader Duke community and inspire action.

Faculty Lead: Billy Pizer

Project Manager: TBD


Identifying extreme events in wholesale energy markets

Tether Energy finances and operates various distributed energy resources in wholesale energy markets, ranging from solar panels to residential smart thermostats. Tether also does financial trading when it identifies arbitrage opportunities in these markets. One of Tether Energy's main operational risks is the very high volatility in wholesale real-time (or spot) energy prices. Where stock markets consider a 30% change in price large, energy markets routinely face changes in price on the order of 300%. This high volatility comes from three main "shocks": 1. power demand changes, due to unpredictable weather, industrial patterns, or human consumption; 2. fuel shortages, driven by trade, extraction/exploration, and gathering/transportation economics; 3. electrical transmission outages, driven by operational failure, extreme weather events, and human behavior.

First, this project team will identify what should be considered an "extreme" price shock from 5-10 years of historical data in PJM. Second, the team will work to automatically identify potential causes for the rare events from news articles, public filings, and Tether's own structured data. Third, the team will build reasonable priors for the occurrences of these rare events, and incorporate potential covariance between the events using copulas or similar methods. Finally, the team will create a simple classifier such as logistic regression to predict the likelihood of a price shock on a given day. The model needs to be evaluated with a walk-forward backtest, training on about 3 years of data at a time, and shifting forward the training window in approximately one-month increments, to smooth out potential bias and overfitting in the model. 
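For readers unfamiliar with the walk-forward backtest described above, the windowing can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function name `walk_forward_splits`, its parameters, and the choice to test each fold on the month that immediately follows its training window are all assumptions, as is the five-year sample length in the usage example.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_days: int, train_days: int = 3 * 365, step_days: int = 30
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges for a rolling walk-forward backtest.

    Each fold trains on `train_days` consecutive days and tests on the
    `step_days` that immediately follow (an assumed evaluation scheme);
    the window then slides forward by `step_days`.
    """
    start = 0
    # Stop once a full training window plus a full test window no longer fits.
    while start + train_days + step_days <= n_days:
        train = range(start, start + train_days)
        test = range(start + train_days, start + train_days + step_days)
        yield train, test
        start += step_days


if __name__ == "__main__":
    # Hypothetical sample: five years of daily observations.
    folds = list(walk_forward_splits(n_days=5 * 365))
    print(len(folds), "folds")
    first_train, first_test = folds[0]
    print("first fold trains on", first_train, "and tests on", first_test)
```

A classifier such as logistic regression would be refit on each training range and scored on the matching test range, so every prediction is made using only data from before the day being predicted. In this sketch, a trailing partial month that cannot fill a whole test window is dropped.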

Project Lead: Eric Butter, Tether Energy

Project Manager: TBD