Interview Assessment

Attachment: zip archive (502KB)

Download the zip file above and extract it to a folder of your choice. After extracting, you should find two Excel files and a PDF. Refer to the PDF for your assessment.

Task 1:

Import the Excel file (Solar_Energy.xlsx):

Solution
import pandas as pd
# path is the folder where the zip file was extracted
solar_data = pd.read_excel(fr'{path}\Solar_Energy.xlsx')
solar_data

Clean the data:

Let us drop rows with missing values and duplicate rows, if any.

Solution
solar_data.dropna(inplace=True)   #drop NA Values
solar_data.drop_duplicates(inplace=True)  #drop Duplicates
solar_data.reset_index(drop=True,inplace=True)  #Reset Index
solar_data

Find the total volume of the articles:

We can use the shape attribute to find the total number of articles.

Solution
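The shape approach described above can be sketched as follows. This is a minimal illustration, not the original solution: the small DataFrame and its column names (Author, Body) are hypothetical stand-ins for the cleaned solar_data.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned solar_data DataFrame;
# the column names are assumptions, not taken from the real file.
solar_data = pd.DataFrame({
    "Author": ["A. Rao", "B. Lee", "A. Rao"],
    "Body": ["Solar panels are cheap.", "Wind farms expand.", "SOLAR storage grows."],
})

# shape is a (rows, columns) tuple, so index 0 is the number of articles
total_articles = solar_data.shape[0]
print(total_articles)
```

Since each row holds one article, the row count after cleaning is the total volume.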

Find the total number of unique authors in the articles:

Solution
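One possible way to count unique authors is the nunique method on the author column. The sketch below assumes a column named Author, which is a hypothetical name; the sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample; the real column name for authors may differ.
solar_data = pd.DataFrame({
    "Author": ["A. Rao", "B. Lee", "A. Rao"],
    "Body": ["Solar panels are cheap.", "Wind farms expand.", "SOLAR storage grows."],
})

# nunique counts distinct non-null values in the column
unique_authors = solar_data["Author"].nunique()
print(unique_authors)
```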

In how many articles is the word "solar" (case insensitive) mentioned?

Solution

Convert articles to lowercase:

Convert all articles to lowercase so that no occurrence of the "solar" keyword is missed.

Conditional Formatting:

We can use str methods such as str.contains to find the "solar" keyword.
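The two steps above (lowercasing, then matching with str methods) might look like the sketch below. The column names Body and Body_lower are assumptions for illustration, and the sample rows are invented.

```python
import pandas as pd

# Hypothetical sample; the real article text column may have another name.
solar_data = pd.DataFrame({
    "Author": ["A. Rao", "B. Lee", "A. Rao"],
    "Body": ["Solar panels are cheap.", "Wind farms expand.", "SOLAR storage grows."],
})

# Step 1: lowercase every article so the match is case insensitive
solar_data["Body_lower"] = solar_data["Body"].str.lower()

# Step 2: boolean mask of the articles containing the substring "solar"
mentions_solar = solar_data["Body_lower"].str.contains("solar", regex=False)

# True counts as 1, so the sum is the number of matching articles
solar_article_count = int(mentions_solar.sum())
print(solar_article_count)
```

Note that this is a substring match, so a word such as "solarium" would also count. If whole-word matching is needed, a regular expression with word boundaries is one option.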

Find the total number of articles each author has written:

We can group by author name and count the article bodies for each author.

Solution
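The group-and-count idea described above can be sketched as follows, again assuming hypothetical column names Author and Body and invented sample rows.

```python
import pandas as pd

# Hypothetical sample; column names are assumptions.
solar_data = pd.DataFrame({
    "Author": ["A. Rao", "B. Lee", "A. Rao"],
    "Body": ["Solar panels are cheap.", "Wind farms expand.", "SOLAR storage grows."],
})

# Group rows by author, then count the non-null article bodies in each group
articles_per_author = solar_data.groupby("Author")["Body"].count()
print(articles_per_author)
```

The result is a Series indexed by author name, so articles_per_author["A. Rao"] gives that author's article count.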

Add two columns and extract the month and year from the date:

Solution
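A minimal sketch of the month and year extraction, assuming the date lives in a column named Date (a hypothetical name) and is parseable by pd.to_datetime. The new column names Month and Year are also just illustrative choices.

```python
import pandas as pd

# Hypothetical sample; the real date column name and format may differ.
solar_data = pd.DataFrame({
    "Author": ["A. Rao", "B. Lee"],
    "Date": ["2021-03-15", "2020-11-02"],
})

# Make sure the column has a datetime dtype before using the .dt accessor
solar_data["Date"] = pd.to_datetime(solar_data["Date"])

# Add the two new columns
solar_data["Month"] = solar_data["Date"].dt.month
solar_data["Year"] = solar_data["Date"].dt.year
print(solar_data)
```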

Add another column and extract Domain from the URL:

Solution
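One way to extract the domain is urlparse from the Python standard library, applied to each URL. The column names URL and Domain are assumptions, and the example URLs are placeholders.

```python
import pandas as pd
from urllib.parse import urlparse

# Hypothetical sample; the real URL column name may differ.
solar_data = pd.DataFrame({
    "URL": [
        "https://www.example.com/news/solar-farm",
        "https://blog.example.org/post/1",
    ],
})

# netloc is the network location part of the URL, i.e. the domain
solar_data["Domain"] = solar_data["URL"].apply(lambda url: urlparse(url).netloc)
print(solar_data["Domain"].tolist())
```

urlparse only fills netloc when the URL includes a scheme such as https://, so URLs stored without a scheme would need extra handling.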

Task 2:

Hint: Approach to Solve the Problem

Combine Multiple Excel Worksheets into a Single DataFrame
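The hint can be sketched as follows. Calling pd.read_excel with sheet_name=None returns a dict that maps each sheet name to a DataFrame; to keep the example runnable without the assessment file, the dict below is a hypothetical stand-in for that result, with invented sheet names and columns.

```python
import pandas as pd

# In practice: sheets = pd.read_excel(file_path, sheet_name=None)
# The dict below is a made-up stand-in with the same structure.
sheets = {
    "Sheet1": pd.DataFrame({"Author": ["A. Rao"], "Body": ["Solar text"]}),
    "Sheet2": pd.DataFrame({"Author": ["B. Lee"], "Body": ["Wind text"]}),
}

# Stack all sheets; the dict keys become an index level named "Sheet"
combined = pd.concat(sheets, names=["Sheet", "Row"])

# Move the sheet name into a regular column and renumber the rows
combined = combined.reset_index(level="Sheet").reset_index(drop=True)
print(combined)
```

Keeping the sheet name as a column makes it possible to trace each row back to its worksheet. This assumes the worksheets share the same columns.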
