IMDB - Dataset Analysis - Basic
Project 1: Explanatory Data Analysis & Data Presentation (Movies Dataset)
Project Brief for Self-Coders
Data Import and first Inspection
Import Necessary Libraries for this Task
Read the Movie Data

Getting Info About Data
Statistical Summary
Columns in the summary: id, budget_musd, revenue_musd, vote_count, vote_average, popularity, runtime, cast_size, crew_size

The Best and Worst Movies Ever
Select the Columns Needed to Determine the Best and Worst Movies
Create a column 'profit_musd' (revenue - budget)

Create a column 'return_musd' (revenue/budget)
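The two derived columns above can be sketched as follows. This is a minimal illustration on made-up toy data, not the notebook's own code. The column names `budget_musd` and `revenue_musd` come from the statistical summary, and treating a zero budget as missing is an assumption made here to avoid infinite ratios.

```python
import numpy as np
import pandas as pd

# Toy data with made-up values; only the column names follow the notebook.
df = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C"],
    "budget_musd": [10.0, 50.0, 0.0],
    "revenue_musd": [30.0, 25.0, 5.0],
})

# Profit is an absolute difference, in millions of USD.
df["profit_musd"] = df["revenue_musd"] - df["budget_musd"]

# Return is a ratio (revenue / budget). A zero budget is replaced by NaN
# first (an assumption of this sketch) so the ratio is missing, not infinite.
df["return_musd"] = df["revenue_musd"] / df["budget_musd"].replace(0, np.nan)

print(df)
```

Note that `return_musd` is a unitless ratio despite its name, which is why it is later presented as ROI.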

Rename Columns to Meaningful Labels for Later Use in Graphs

Set Title as Index
Convert Our DataFrame into HTML (Poster, Title, Popularity)
Highest Rated Movies

Movies With Highest ROI

Create a Function to find Best and Worst Movies
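One possible shape for such a function is sketched below. The name `best_worst` and its parameters (`n`, `by`, `ascending`, `min_budget`, `min_votes`) are hypothetical, chosen for illustration; the notebook's actual function is not shown in this outline.

```python
import pandas as pd

def best_worst(df, n, by, ascending=False, min_budget=0, min_votes=0):
    """Return the top n rows ranked by column `by`, after basic filtering.

    Hypothetical helper: rows below the budget or vote-count thresholds
    are dropped before ranking so that tiny productions do not dominate.
    """
    mask = (df["budget_musd"] >= min_budget) & (df["vote_count"] >= min_votes)
    ranked = df.loc[mask, ["title", by]].sort_values(by=by, ascending=ascending)
    return ranked.head(n)

# Toy data with made-up values.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "budget_musd": [10.0, 50.0, 5.0, 100.0],
    "revenue_musd": [30.0, 25.0, 50.0, 400.0],
    "vote_count": [100, 5, 200, 1000],
})
movies["profit_musd"] = movies["revenue_musd"] - movies["budget_musd"]

top = best_worst(movies, n=2, by="profit_musd")                    # highest profit
worst = best_worst(movies, n=1, by="profit_musd", ascending=True)  # lowest profit
print(top)
print(worst)
```

The same call with a different `by` column would cover each of the Top 5 lists that follow.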
Top 5 - Highest Revenue

Top 5 - Highest Budget

Top 5 - Highest Profit

Top 5 - Highest ROI

Top 5 - Lowest Profit

Top 5 - Most Popular

Find Your Next Movie

Filtering Genres (Science Fiction and Action)
Filtering Bruce Willis Movies
Filtering
Movies With Uma Thurman and Quentin Tarantino
Most Successful Pixar Movies from 2010 to 2015 (Highest Revenue)
Action or Thriller Movies in English with a Minimum Rating of 7.5 (Most Recent First)
Filtering Genre (Action Or Thriller)
Filtering Language
Filtering Vote Count (greater than 10)
Filter Average Rating
Filter:
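The filter steps above can be combined with boolean masks, roughly as follows. This is a sketch on made-up data. It assumes that `genres` holds pipe-separated genre names and that the columns `original_language` and `release_date` exist; none of these are confirmed by the outline.

```python
import pandas as pd

# Toy data with made-up values; column names are assumptions.
df = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"],
    "genres": ["Action|Drama", "Thriller", "Comedy", "Action|Thriller", "Thriller|Crime"],
    "original_language": ["en", "en", "en", "fr", "en"],
    "vote_count": [500, 8, 300, 900, 400],
    "vote_average": [7.8, 9.0, 8.1, 7.9, 7.6],
    "release_date": pd.to_datetime(
        ["2015-06-01", "2017-03-10", "2016-01-20", "2018-09-05", "2019-02-01"]
    ),
})

# One mask per condition; "Action|Thriller" is a regex meaning Action OR Thriller.
mask_genre = df["genres"].str.contains("Action|Thriller", na=False)
mask_lang = df["original_language"] == "en"
mask_votes = df["vote_count"] > 10
mask_rating = df["vote_average"] >= 7.5

# Combine with & and sort so the most recent release comes first.
result = df.loc[mask_genre & mask_lang & mask_votes & mask_rating]
result = result.sort_values("release_date", ascending=False)
print(result[["title", "release_date", "vote_average"]])
```

Keeping each condition in its own named mask makes it easy to reuse the pieces for the other searches in this section.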
Most Common Words in Titles and Taglines
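The outline does not show how the notebook counts or visualizes words, so the following is only one possible approach: a plain frequency count with the standard library. The sample titles and the small stopword set are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical sample titles and a tiny illustrative stopword set.
titles = ["The Last Day", "Day of the Storm", "Storm Rising"]
STOPWORDS = {"the", "of", "a", "and"}

def word_counts(texts, stopwords=STOPWORDS):
    """Count lowercase words across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts

counts = word_counts(titles)
print(counts.most_common(3))
```

The same function could be applied to a tagline column after dropping missing values.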


Are Franchises More Successful?
All Franchises
Count Franchise/Standalone Movies
Revenue (Franchise Vs Standalone Movies)
Budget (Franchise Vs Standalone Movies)
Average Rating (Franchise Vs Standalone Movies)
Popularity (Franchise Vs Standalone Movies)
Return on Investment (Franchise Vs Standalone Movies)
Aggregate Functions
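The franchise versus standalone comparisons above can be produced in a single grouped aggregation. The sketch below uses made-up data and assumes a `belongs_to_collection` column that is missing for standalone movies; that column name is an assumption, not taken from the outline.

```python
import pandas as pd

# Toy data with made-up values.
df = pd.DataFrame({
    "title": ["Movie A", "Movie A 2", "Movie B", "Movie C"],
    "belongs_to_collection": ["A Collection", "A Collection", None, None],
    "revenue_musd": [100.0, 200.0, 30.0, 50.0],
    "vote_average": [7.0, 6.0, 8.0, 6.0],
})

# Label each movie by whether it belongs to a collection.
df["kind"] = df["belongs_to_collection"].notna().map(
    {True: "Franchise", False: "Standalone"}
)

# Named aggregation: one output column per (source column, function) pair.
summary = df.groupby("kind").agg(
    movies=("title", "count"),
    mean_revenue=("revenue_musd", "mean"),
    median_rating=("vote_average", "median"),
)
print(summary)
```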

Most Successful Franchise?
Largest Franchise


Highest Revenue
Columns: title, revenue_musd, budget_musd, roi, vote_average, popularity, vote_count
Can you do it with nlargest?
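As a hint for the exercise, `DataFrame.nlargest` returns the same rows as sorting in descending order and taking the head, provided there are no ties. The sketch below uses made-up data, and the `belongs_to_collection` column name is an assumption.

```python
import pandas as pd

# Toy data with made-up values; the last row is a standalone movie.
df = pd.DataFrame({
    "title": ["A1", "A2", "B1", "C1", "C2", "S1"],
    "belongs_to_collection": ["A Collection", "A Collection", "B Collection",
                              "C Collection", "C Collection", None],
    "revenue_musd": [100.0, 200.0, 500.0, 50.0, 60.0, 999.0],
})

# groupby drops rows with a missing key, so standalone movies are excluded.
franchises = df.groupby("belongs_to_collection").agg(
    total_revenue=("revenue_musd", "sum"),
    movies=("title", "count"),
)

top_sorted = franchises.sort_values("total_revenue", ascending=False).head(2)
top_nlargest = franchises.nlargest(2, "total_revenue")
print(top_nlargest)
```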
Highest Average Revenue

Most Expensive Franchises (Budget)

Highest Rated Franchises

Most Successful Directors
Most Number Of Movies (top 5)
Highest Revenues By Directors
Highest Number of Franchises directed by Directors
Aggregate Functions
Highest Rated Movies
Columns: title, vote_count, vote_average
Find Successful Directors in a Specific Genre (e.g. Action)
Find Successful Actors
Set id as index
Split Actor Names to a DataFrame

Convert Series to DataFrame

Rename column label from 0 to 'Actor'
Merge Dataframe with Actors DataFrame
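The steps above (set `id` as index, split, convert to a DataFrame, rename, merge) might look like the following sketch. It assumes a `cast` column holding pipe-separated actor names, which the outline does not confirm, and uses made-up data.

```python
import pandas as pd

# Toy data with made-up values; movie 3 has no cast information.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "title": ["Movie A", "Movie B", "Movie C"],
    "cast": ["Actor X|Actor Y", "Actor Y|Actor Z", None],
    "revenue_musd": [100.0, 50.0, 10.0],
}).set_index("id")

# Split into one column per actor, then stack into one row per (movie, actor).
act = df["cast"].str.split("|", expand=True)
act = act.stack().dropna().reset_index(level=1, drop=True).to_frame()

# The stacked column is labelled 0; give it a meaningful name.
act = act.rename(columns={0: "Actor"})

# Attach movie-level columns to every actor row via the shared id index.
act = act.merge(df[["title", "revenue_musd"]], how="left",
                left_index=True, right_index=True)

print(act)
print(act["Actor"].nunique())       # number of unique actors
print(act["Actor"].value_counts())  # movies per actor
```

On recent pandas versions, `df["cast"].str.split("|").explode()` is a shorter route to the same long format.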

Number of Unique Actors
Actors with highest number of movies

Actors who have acted in more than 10 films

Highest Revenue
Highest Number of Films
Highest Rating
Popularity
Find Common Actors in the top lists
Concat all the dataframes

Find Duplicate Records of Actors
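Finding actors who appear in more than one top list can be sketched with `pd.concat` and `duplicated`. The three small frames below are made up and stand in for the notebook's actual top lists; the `list` column is a hypothetical label added for readability.

```python
import pandas as pd

# Hypothetical top lists, one per ranking criterion.
top_revenue = pd.DataFrame({"Actor": ["Actor X", "Actor Y", "Actor Z"], "list": "revenue"})
top_films = pd.DataFrame({"Actor": ["Actor Y", "Actor W", "Actor X"], "list": "films"})
top_rating = pd.DataFrame({"Actor": ["Actor V", "Actor Y", "Actor U"], "list": "rating"})

# Stack the lists on top of each other.
combined = pd.concat([top_revenue, top_films, top_rating], ignore_index=True)

# keep=False marks every occurrence of a repeated name, not just the later ones.
dupes = combined[combined["Actor"].duplicated(keep=False)]
common_actors = sorted(dupes["Actor"].unique())

print(dupes.sort_values("Actor"))
print(common_actors)
```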

What are the most successful/popular genres? Has this changed over time (e.g. 80s vs. 90s)?
Merge gen and original dataframe
Aggregate Functions
Genre With Highest Revenue
Columns: revenue_musd, vote_average, popularity
Genre With Highest Rating
Columns: revenue_musd, vote_average, popularity
Genre With Highest Popularity
Columns: revenue_musd, vote_average, popularity
Highest Revenue Generated by Genre in the 1990s

Popularity of Genres in Nineties
Try to find Total revenue and average rating too.
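A per-decade genre summary, including total revenue and average rating, could be computed along these lines. This is a sketch under assumptions: `genres` is taken to be pipe-separated, `release_date` is taken to be a datetime column, and all values are made up.

```python
import pandas as pd

# Toy data with made-up values.
df = pd.DataFrame({
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "genres": ["Action|Drama", "Drama", "Action", "Comedy"],
    "release_date": pd.to_datetime(["1994-05-01", "1998-11-20", "2005-07-04", "1991-02-14"]),
    "revenue_musd": [100.0, 40.0, 300.0, 60.0],
    "vote_average": [8.0, 6.0, 7.5, 7.0],
    "popularity": [10.0, 4.0, 30.0, 6.0],
})

# One row per (movie, genre): a multi-genre movie counts toward each genre.
gen = df.assign(genre=df["genres"].str.split("|")).explode("genre")

# Keep only releases from 1990 through 1999 (both ends inclusive).
nineties = gen[gen["release_date"].dt.year.between(1990, 1999)]

by_genre = nineties.groupby("genre").agg(
    total_revenue=("revenue_musd", "sum"),
    mean_rating=("vote_average", "mean"),
    mean_popularity=("popularity", "mean"),
).sort_values("total_revenue", ascending=False)
print(by_genre)
```

Changing the bounds passed to `between` gives the same table for any other decade.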
Highest revenue generated by Genre in 20's

Popularity of Genres in Twenties
Find the Most Successful Production Companies on Your Own