Data Science and Machine Learning Applied to Silicon Photovoltaic Solar Panels: Doing Energy Science at Scale with Time-series and Image Datasets

Event Date:
January 23rd 4:00 PM - 5:00 PM

Presented by Roger French, Kyocera Professor, Materials Science & Engineering; Director, SDLE Research Center; Faculty Director, Applied Data Science Program 

See French's Presentation 

Abstract: Advances in computing, communication, and data collection have made it possible to gather petabyte-scale datasets from which data-driven models can be built. This digital transformation affects society, industry, and academia, since data-driven models can challenge how things are done and open new opportunities to improve how things work.

At CWRU we have offered the university-wide Applied Data Science (ADS) program since 2015. The ADS program teaches non-computer-science students, producing “T-shaped” graduates with deep knowledge in their domain plus strong data science skills. The ADS program provides both an undergraduate minor and graduate-level courses, for which a University Certificate is being developed. ADS students learn the foundations (coding, inferential statistics, exploratory data analysis, modeling, and prediction) and complete a semester-long data science project for their ADS portfolio. The courses are taught using a practicum approach, with an open data science toolchain consisting of R, Python, Git, Markdown, Machine Learning, and TensorFlow on GPUs.

We use data science and big-data analytics to address critical problems in energy science. As solar power grows, we need to fully understand and predict the power output of photovoltaic (PV) modules over their entire lifetimes of more than 30 years. Degradation science [reference 1] combines data-driven statistical and machine learning with physical and chemical science to examine degradation mechanisms in order to improve PV materials and reduce system failures. We use distributed and high-performance computing, based on Hadoop 2 and the NoSQL database HBase, to ingest, analyze, and model large volumes of time-series datasets from 3.4 GW of PV power plants [reference 2]. We have developed an automated image processing and deep learning pipeline applied to electroluminescence (EL) images of PV modules to identify degradation mechanisms and predict their associated power losses [reference 3]. Unbiased, data-driven analytics, now possible using data science methodologies, represents a new front in our research studies of critically important and complex systems.
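As a rough illustration of what modeling PV time-series data for degradation can look like, the sketch below fits a straight line to a normalized power time series and reports the slope as a percentage of initial output per year. This is not the method used in the work described above: the linear model, the function names (fit_linear_trend, degradation_rate_percent_per_year), the assumed rate of -0.6 %/year, and the synthetic data are all assumptions made for illustration only.

```python
import random


def fit_linear_trend(times_years, values):
    """Ordinary least-squares fit of values = intercept + slope * time."""
    n = len(times_years)
    if n < 2 or n != len(values):
        raise ValueError("need two or more points and equal-length inputs")
    mean_t = sum(times_years) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    if sxx == 0:
        raise ValueError("time values must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v)
              for t, v in zip(times_years, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return intercept, slope


def degradation_rate_percent_per_year(times_years, values):
    """Slope of the fitted line as a percentage of the fitted initial value."""
    intercept, slope = fit_linear_trend(times_years, values)
    return 100.0 * slope / intercept


# Synthetic (not measured) data: ten years of monthly normalized power
# with a hypothetical linear loss and a small amount of Gaussian noise.
TRUE_RATE = -0.6  # percent per year; an assumed value for illustration
rng = random.Random(0)
times = [month / 12 for month in range(120)]
power = [1.0 * (1 + TRUE_RATE / 100 * t) + rng.gauss(0, 0.005)
         for t in times]

estimated_rate = degradation_rate_percent_per_year(times, power)
print(f"estimated degradation rate: {estimated_rate:.2f} %/year")
```

Real field data would typically need filtering, weather normalization, and handling of seasonality before any such fit is meaningful; the sketch omits all of that.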

References

1. R. H. French, et al., Degradation science: Mesoscopic evolution and temporal analytics of photovoltaic energy materials, Curr. Op. Sol. State & Matls. Sci. 19 (2015) 212–226.

2. Y. Hu, et al., A Nonrelational Data Warehouse for the Analysis of Field and Laboratory Data From Multiple Heterogeneous Photovoltaic Test Sites, IEEE Journal of Photovoltaics 7 (2017) 230–236.

3. A. M. Karimi, et al., Automated Pipeline for Photovoltaic Module Electroluminescence Image Processing and Degradation Feature Classification, IEEE Journal of Photovoltaics (2019) 1–12.