Pandas Interview Questions and Answers
Freshers / Beginner level questions & answers
Ques 1. What is Pandas?
Pandas is an open-source data manipulation and analysis library for Python.
Ques 2. How to import Pandas?
You can import Pandas using the statement: import pandas as pd
Ques 3. What is a DataFrame?
A DataFrame is a two-dimensional, tabular data structure in Pandas.
Ques 4. How to create a DataFrame in Pandas?
You can create a DataFrame using the pd.DataFrame() constructor.
Example:
df = pd.DataFrame({'Column1': [1, 2, 3], 'Column2': ['A', 'B', 'C']})
Ques 5. How to select specific columns in a DataFrame?
You can select specific columns using double square brackets: df[['Column1', 'Column2']]
Ques 6. What is the purpose of the describe() function in Pandas?
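describe() returns summary statistics (count, mean, standard deviation, minimum, quartiles, and maximum) for the numeric columns of a DataFrame by default. Pass include='all' to summarize non-numeric columns as well.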
Example:
df.describe()
Ques 7. Explain the concept of broadcasting in Pandas.
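Broadcasting lets an operation between objects of different shapes be applied element-wise, for example a scalar applied to every value in a column. Between a Series and a DataFrame, Pandas first aligns on index and column labels.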
Example:
df['Column'] = df['Column'] * 2
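To show broadcasting beyond a single scalar, here is a minimal sketch that assumes a small made-up DataFrame with columns 'a' and 'b' (both names are hypothetical). Subtracting the Series of column means from the DataFrame aligns on column labels and applies the subtraction to every row.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# A scalar is broadcast to every element of the column
doubled = df['a'] * 2

# df.mean() is a Series indexed by column name; it is aligned on the
# column labels and broadcast across all rows
centered = df - df.mean()

print(doubled.tolist())
print(centered)
```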
Ques 8. Explain the purpose of the crosstab() function in Pandas.
crosstab() computes a cross-tabulation of two or more factors.
Example:
pd.crosstab(df['Factor1'], df['Factor2'])
Ques 9. How to handle categorical data in Pandas?
You can use the astype() method to convert a column to a categorical type: df['Category'] = df['Category'].astype('category')
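When the categories have a natural order, an ordered categorical type can be useful. The sketch below assumes a made-up 'Category' column with the values 'low', 'medium', and 'high'; it is an illustration, not the only way to do this.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({'Category': ['low', 'high', 'medium', 'low']})

# Plain conversion: categories are inferred and sorted alphabetically
df['Category'] = df['Category'].astype('category')
inferred = list(df['Category'].cat.categories)

# Explicit, ordered categories so that sorting and comparisons
# follow low < medium < high instead of alphabetical order
size_type = pd.CategoricalDtype(categories=['low', 'medium', 'high'], ordered=True)
ordered = df['Category'].astype(size_type)

print(inferred)
print(ordered.sort_values().tolist())
```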
Ques 10. Explain the use of the nunique() function in Pandas.
nunique() returns the number of distinct values in a column. Missing values are excluded by default (dropna=True).
Example:
df['Column'].nunique()
Ques 11. What is the use of the nlargest() function in Pandas?
nlargest() returns the n largest values of a Series in descending order. On a DataFrame, it returns the n rows with the largest values in the columns you specify.
Example:
df['Column'].nlargest(5)
Ques 12. How to convert a Pandas DataFrame to a NumPy array?
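You can use the to_numpy() method: df.to_numpy(). The older values attribute (df.values) still works, but the Pandas documentation recommends to_numpy().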
Intermediate / 1 to 5 years experienced level questions & answers
Ques 13. What is the difference between loc and iloc in Pandas?
loc selects rows and columns by label, while iloc selects them by integer position. Slices with loc include the end label; slices with iloc exclude the end position.
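The difference is easiest to see with a non-numeric index. This is a minimal sketch that assumes a made-up DataFrame with a 'score' column and row labels 'x', 'y', 'z'.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({'score': [90, 80, 70]}, index=['x', 'y', 'z'])

by_label = df.loc['y', 'score']    # row labelled 'y', column 'score'
by_position = df.iloc[1, 0]        # second row, first column

# loc slices include the end label; iloc slices exclude the end position
label_slice = df.loc['x':'y']      # rows 'x' and 'y'
position_slice = df.iloc[0:1]      # row 'x' only

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```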
Ques 14. How to drop a column in a DataFrame?
You can drop a column using the drop() method: df = df.drop(columns='ColumnName'). Passing inplace=True instead modifies df directly and returns None.
Ques 15. Explain the use of groupby() in Pandas.
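groupby() splits a DataFrame into groups by the values of one or more columns so that an aggregate function can be applied to each group. In Pandas 2.0 and later, mean() raises an error if the DataFrame has non-numeric columns, so pass numeric_only=True or select the columns to aggregate.
Example:
df.groupby('Column').mean(numeric_only=True)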
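A small worked example may help. The sketch below assumes a made-up DataFrame with 'team' and 'points' columns, and uses named aggregation to compute several statistics per group.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    'team': ['A', 'A', 'B', 'B'],
    'points': [10, 20, 30, 50],
})

# Select the column to aggregate, then apply one aggregate function
mean_points = df.groupby('team')['points'].mean()

# Named aggregation: several statistics per group in one call
summary = df.groupby('team').agg(
    total=('points', 'sum'),
    best=('points', 'max'),
)

print(mean_points)
print(summary)
```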
Ques 16. What is the purpose of the apply() function in Pandas?
apply() applies a function along an axis of a DataFrame (to each column by default, or to each row with axis=1). On a Series, it applies the function to each value.
Example:
df['Column'].apply(lambda x: x*2)
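The three common forms of apply() can be compared side by side. This sketch assumes a made-up DataFrame with numeric columns 'a' and 'b'.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 20, 30]})

# Series.apply: the function receives one value at a time
doubled = df['a'].apply(lambda x: x * 2)

# DataFrame.apply with the default axis=0: the function receives each column
col_range = df.apply(lambda col: col.max() - col.min())

# DataFrame.apply with axis=1: the function receives each row
row_sum = df.apply(lambda row: row['a'] + row['b'], axis=1)

print(doubled.tolist(), col_range.tolist(), row_sum.tolist())
```

For simple arithmetic like this, vectorized expressions such as df['a'] * 2 are usually faster than apply().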
Ques 17. How to filter rows in a DataFrame based on a condition?
You can use boolean indexing to filter rows based on a condition: df[df['Column'] > 10]
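Conditions can be combined with & (and), | (or), and ~ (not), with each condition wrapped in parentheses. The sketch below assumes a made-up DataFrame with a numeric 'Column' and a hypothetical 'Group' column.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    'Column': [5, 12, 18, 25],
    'Group': ['x', 'y', 'x', 'z'],
})

# Both conditions must hold; parentheses are required around each one
both = df[(df['Column'] > 10) & (df['Group'] == 'x')]

# Either condition may hold; isin() tests membership in a list
either = df[(df['Column'] > 20) | df['Group'].isin(['y'])]

print(both)
print(either)
```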
Ques 18. What is the purpose of the pivot_table() function?
pivot_table() is used to create a spreadsheet-style pivot table as a DataFrame.
Example:
pd.pivot_table(df, values='Value', index='Index', columns='Column', aggfunc='sum')
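As a worked illustration, the sketch below builds a made-up sales table that reuses the column names 'Index', 'Column', and 'Value' from the example and sums the values for each combination. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only
sales = pd.DataFrame({
    'Index': ['east', 'east', 'west', 'west', 'east'],
    'Column': ['q1', 'q2', 'q1', 'q2', 'q1'],
    'Value': [100, 150, 200, 250, 50],
})

# One row per 'Index' value, one column per 'Column' value,
# each cell holding the sum of the matching 'Value' entries
table = pd.pivot_table(
    sales, values='Value', index='Index', columns='Column', aggfunc='sum'
)

print(table)
```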
Ques 19. How to handle duplicate values in a DataFrame?
You can use drop_duplicates() to remove duplicate rows: df.drop_duplicates()
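Duplicates can also be detected before removal, or judged on a subset of columns. This is a minimal sketch that assumes a made-up DataFrame with 'id' and 'value' columns.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({'id': [1, 1, 2, 2], 'value': ['a', 'a', 'b', 'c']})

# duplicated() marks every repeat of an earlier, fully identical row
flags = df.duplicated()

# Remove fully identical rows, keeping the first occurrence
unique_rows = df.drop_duplicates()

# Judge duplicates on 'id' only and keep the last row for each id
last_per_id = df.drop_duplicates(subset='id', keep='last')

print(flags.tolist())
print(last_per_id)
```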
Ques 20. Explain the purpose of the iterrows() function in Pandas.
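iterrows() is used to iterate over DataFrame rows as (index, Series) pairs. It is convenient but slow on large DataFrames, where vectorized operations or itertuples() are usually faster.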
Example:
for index, row in df.iterrows():
    print(index, row['Column'])
Ques 21. Explain the use of melt() function in Pandas.
melt() is used to reshape or transform data by unpivoting it.
Example:
pd.melt(df, id_vars=['ID'], value_vars=['Var1', 'Var2'])
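To see what unpivoting produces, the sketch below applies the same call to a made-up wide table with the columns 'ID', 'Var1', and 'Var2'. The data is hypothetical.

```python
import pandas as pd

# Hypothetical wide-format data for illustration only
wide = pd.DataFrame({'ID': [1, 2], 'Var1': [10, 20], 'Var2': [30, 40]})

# Each (ID, VarN) cell becomes its own row; the former column name goes
# into 'variable' and the cell content into 'value'
long = pd.melt(wide, id_vars=['ID'], value_vars=['Var1', 'Var2'])

print(long)
```

The two-row, three-column table becomes a four-row table with the columns ID, variable, and value.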
Ques 22. What is the purpose of the to_csv() method in Pandas?
to_csv() is used to write a DataFrame to a CSV file.
Example:
df.to_csv('output.csv', index=False)
Ques 23. How to calculate correlation between columns in a DataFrame?
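You can use the corr() method: df.corr(numeric_only=True). In Pandas 2.0 and later, corr() raises an error if the DataFrame contains non-numeric columns unless numeric_only=True is passed. To correlate two specific columns, use df['A'].corr(df['B']).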
Ques 24. Explain the purpose of the get_dummies() function in Pandas.
get_dummies() is used for one-hot encoding categorical variables.
Example:
pd.get_dummies(df['Category'])
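The sketch below shows the shape of the result on a made-up 'Category' column. The dtype=int argument is an illustrative choice to get 0/1 integers; recent Pandas versions return boolean columns when it is omitted.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({'Category': ['red', 'green', 'red']})

# One indicator column per distinct category
dummies = pd.get_dummies(df['Category'], dtype=int)

print(dummies)
```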
Experienced / Expert level questions & answers
Ques 25. Explain the use of merge() in Pandas.
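merge() combines two DataFrames on one or more common columns or indexes, much like a SQL join. The how parameter selects the join type ('inner' by default; also 'left', 'right', 'outer', and 'cross').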
Example:
pd.merge(df1, df2, on='common_column')
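The join type changes which rows survive. This is a minimal sketch that assumes two made-up DataFrames sharing 'common_column'; the other column names are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only
df1 = pd.DataFrame({'common_column': [1, 2, 3], 'left_val': ['a', 'b', 'c']})
df2 = pd.DataFrame({'common_column': [2, 3, 4], 'right_val': ['x', 'y', 'z']})

# Inner join (the default): only keys present in both frames
inner = pd.merge(df1, df2, on='common_column')

# Outer join: all keys from both frames; unmatched cells become NaN.
# indicator=True adds a '_merge' column naming each row's source.
outer = pd.merge(df1, df2, on='common_column', how='outer', indicator=True)

print(inner)
print(outer)
```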
Ques 26. How to handle missing values in a DataFrame?
You can use methods like dropna() to remove missing values or fillna() to fill them with a specific value.
Example:
df.dropna()    # drop rows that contain missing values
df.fillna(0)   # replace missing values with 0
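A typical workflow counts the missing values first and then decides per column how to treat them. The sketch below assumes a made-up DataFrame with a numeric column 'a' and a text column 'b'; the fill values are illustrative choices.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': ['x', None, 'z']})

# Count missing values per column
missing_per_column = df.isna().sum()

# Drop every row that has at least one missing value
dropped = df.dropna()

# Fill each column with its own replacement value
filled = df.fillna({'a': df['a'].mean(), 'b': 'unknown'})

print(missing_per_column)
print(filled)
```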
Ques 27. How to rename columns in a DataFrame?
You can use the rename() method to rename columns: df = df.rename(columns={'OldName': 'NewName'}). rename() returns a new DataFrame, so assign the result back to keep the change.
Ques 28. Explain the concept of MultiIndex in Pandas.
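MultiIndex (a hierarchical index) allows multiple index levels on a single axis, which makes it possible to represent higher-dimensional data in a Series or DataFrame.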
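One common way to get a MultiIndex is set_index() with a list of columns. The sketch below assumes a made-up table with 'region', 'year', and 'sales' columns and shows a few ways to select from it.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    'region': ['east', 'east', 'west', 'west'],
    'year': [2022, 2023, 2022, 2023],
    'sales': [100, 110, 200, 210],
}).set_index(['region', 'year'])

# Select on the outer level: all rows for 'east', indexed by year
east = df.loc['east']

# Select one cell with a tuple that names both levels
one_cell = df.loc[('west', 2023), 'sales']

# Select on the inner level with xs(): all rows for 2022, indexed by region
by_year = df.xs(2022, level='year')

print(df.index.nlevels, one_cell)
print(by_year)
```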
Ques 29. How to handle time series data in Pandas?
Pandas provides the Timestamp and DatetimeIndex types, along with functions such as to_datetime() for parsing dates and resample() for changing the frequency of a time series.
Example:
df['Date'] = pd.to_datetime(df['Date'])
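Once the dates are parsed, resample() can aggregate by period. This is a minimal sketch that assumes a made-up DataFrame with 'Date' and 'Value' columns and sums the values per week ('W' means weeks ending on Sunday).

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-02', '2024-01-08', '2024-01-09'],
    'Value': [1, 2, 3, 4],
})

# Parse the strings into datetime64 values
df['Date'] = pd.to_datetime(df['Date'])

# resample() needs a datetime index; sum the values within each week
weekly = df.set_index('Date')['Value'].resample('W').sum()

print(weekly)
```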
Ques 30. Explain the purpose of the cut() function in Pandas.
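cut() segments numeric values into discrete bins (intervals). Bins are right-inclusive by default, so with the bins in the example a value of 10 falls in the first bin and a value of 0 falls in no bin; pass right=False for left-inclusive bins.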
Example:
pd.cut(df['Values'], bins=[0, 10, 20, 30], labels=['<10', '10-20', '20-30'])
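Which bin a boundary value lands in depends on the right parameter. The sketch below assumes a made-up Series and uses labels=False so that cut() returns the integer position of each bin instead of a label.

```python
import pandas as pd

# Hypothetical data for illustration only
values = pd.Series([5, 10, 15, 30])
bins = [0, 10, 20, 30]

# Default right=True: intervals are (0, 10], (10, 20], (20, 30]
right_codes = pd.cut(values, bins=bins, labels=False)

# right=False: intervals are [0, 10), [10, 20), [20, 30);
# the value 30 now falls outside every bin and becomes NaN
left_codes = pd.cut(values, bins=bins, right=False, labels=False)

print(right_codes.tolist())
print(left_codes.tolist())
```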