Energy production
This project processes and analyzes energy data from three sources: the United Nations, the World Bank, and Scimago Journal & Country Rank. The prepared dataset focuses on the most recent years in the data (2006-2015) and on the top 15 countries by research rank in energy engineering and power technology. The project was implemented with the pandas library, and the full analysis is available at https://github.com/Yossranour1996/Energy-Production.
Key Steps and Tasks:
Data Loading: The project begins by loading data from three different sources:
Energy supply and renewable electricity production data from the United Nations (Energy Indicators.xls).
Gross Domestic Product (GDP) data from the World Bank (world_bank.csv).
Journal contributions rank data from Scimago (scimagojr-3.xlsx).
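The loading step could be sketched roughly as follows. The function names and the header_rows / footer_rows parameters are hypothetical, introduced only for illustration; the actual number of rows to skip depends on the files and is not stated here.

```python
import pandas as pd


def load_energy(path, header_rows, footer_rows):
    """Read the UN energy sheet, skipping non-data rows at the top and bottom."""
    # header_rows and footer_rows are assumptions the caller must supply.
    return pd.read_excel(path, skiprows=header_rows, skipfooter=footer_rows)


def load_gdp(path_or_buffer, header_rows=0):
    """Read the World Bank GDP table from a CSV file or file-like object."""
    return pd.read_csv(path_or_buffer, skiprows=header_rows)


def load_scimago(path):
    """Read the Scimago country ranking spreadsheet."""
    return pd.read_excel(path)
```

Note that pandas relies on an optional engine to read Excel files: xlrd for .xls and openpyxl for .xlsx.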
Data Cleaning and Preparation:
Excluding header and footer information from the United Nations dataset.
Removing the first two columns, which are unnecessary.
Renaming columns to 'Country', 'Energy Supply', 'Energy Supply per Capita', and '% Renewable'.
Converting 'Energy Supply' to gigajoules.
Handling missing data by converting '...' placeholders to NaN (np.nan).
Standardizing and renaming countries to match common naming conventions.
Removing parentheses and numerical digits from country names.
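The cleaning steps above might look like the sketch below. It assumes the raw table has exactly four data columns after the two dropped ones, and that the raw 'Energy Supply' figures are in petajoules (hence the factor of 1,000,000). The COUNTRY_RENAMES map is a hypothetical example, not the project's actual list.

```python
import numpy as np
import pandas as pd

# Hypothetical rename map for illustration; the project's real list may differ.
COUNTRY_RENAMES = {
    "Republic of Korea": "South Korea",
    "United States of America": "United States",
}

GJ_PER_PJ = 1_000_000  # 1 petajoule = 1,000,000 gigajoules


def clean_energy(raw, renames=COUNTRY_RENAMES):
    """Return a cleaned copy of the raw UN energy table."""
    # Drop the first two columns, which carry no data of interest.
    df = raw.iloc[:, 2:].copy()
    df.columns = ["Country", "Energy Supply",
                  "Energy Supply per Capita", "% Renewable"]

    # '...' marks missing values in the source file.
    df = df.replace("...", np.nan)
    for col in ["Energy Supply", "Energy Supply per Capita", "% Renewable"]:
        df[col] = pd.to_numeric(df[col])

    # Assumption: the raw supply figures are in petajoules.
    df["Energy Supply"] = df["Energy Supply"] * GJ_PER_PJ

    # Strip parenthesised suffixes and footnote digits, then rename.
    names = df["Country"].str.replace(r"\s*\(.*\)", "", regex=True)
    names = names.str.replace(r"\d+", "", regex=True).str.strip()
    df["Country"] = names.replace(renames)
    return df
```

Renaming is applied after stripping so that the keys of the rename map can be written without footnote digits or parenthesised suffixes.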
Data Integration:
Joining the cleaned Energy, GDP, and Scimago datasets on the intersection of country names (an inner join).
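A minimal sketch of this join, assuming each of the three frames already has a column named 'Country' (the World Bank file may need its country column renamed first). The function name join_sources is hypothetical.

```python
import pandas as pd


def join_sources(energy, gdp, scimago):
    """Inner-join the three tables on their shared 'Country' column."""
    # how="inner" keeps only countries present in all three tables,
    # i.e. the intersection of country names.
    merged = scimago.merge(energy, on="Country", how="inner")
    merged = merged.merge(gdp, on="Country", how="inner")
    return merged
```

Because the join is on an intersection, any country whose name is spelled differently across sources is silently dropped, which is why the renaming step in the cleaning phase matters.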
Data Selection:
Focusing on data from the last 10 years (2006-2015) for GDP.
Selecting the top 15 countries by Scimago rank.
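The selection could be sketched as below. It assumes the merged table has a numeric 'Rank' column and that GDP year columns are labelled with four-digit strings such as '2006'. Both are assumptions about the input, and select_top_ranked is a hypothetical name.

```python
import pandas as pd

YEARS = [str(year) for year in range(2006, 2016)]  # '2006' through '2015'


def select_top_ranked(merged, top_n=15, years=YEARS):
    """Keep the top_n ranked countries and only the requested GDP years."""

    def keep(col):
        name = str(col)
        is_year = name.isdigit() and len(name) == 4
        # Non-year columns are always kept; year columns only if requested.
        return (not is_year) or name in years

    cols = [c for c in merged.columns if keep(c)]
    top = merged.loc[merged["Rank"] <= top_n, cols]
    return top.sort_values("Rank")
```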
Final Dataset:
Creating a consolidated dataset with 20 columns and 15 entries.
Setting the index as the country name.
Columns include relevant energy indicators, GDP data for the selected years, and research-related metrics.
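As a final step, one might index the table by country and check its shape against the figures above. This is only a sketch: finalize is a hypothetical helper, and it assumes the country column is named 'Country'.

```python
import pandas as pd


def finalize(selected, expected_shape=(15, 20)):
    """Index by country and verify the resulting shape."""
    final = selected.set_index("Country")
    if final.shape != expected_shape:
        raise ValueError(
            f"expected shape {expected_shape}, got {final.shape}"
        )
    return final
```

A failed shape check usually points to a country lost in the join, for example through a naming mismatch between sources.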
Analysis and Insights:
The resulting dataset supports further analysis of the energy indicators, economic factors, and research contributions of the top-ranked countries, as a basis for investigating energy production.
Reporting:
The outcome of this project is a clean, standardized dataset that makes it straightforward to compare energy-related trends across countries, with a focus on the 15 top-ranked ones.
The clean dataset can serve as a valuable resource for researchers, policymakers, and analysts in the field of energy and economics.
Skills:
#Pandas #EDA #Data Manipulation #Python #Git