How to Fix: Data must be 1-dimensional

You might encounter the following error when converting NumPy arrays to a pandas DataFrame: Exception: Data must be 1-dimensional. 1. Reproduce the Error. Output: Exception: Data must be 1-dimensional. 2. Why the Error Happens. It happens because pd.DataFrame expects 1-D NumPy arrays or lists, since it is how columns within …
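
The excerpt above is truncated, so the following is only a minimal sketch of how the error can arise and one possible fix. The arrays X and Y are hypothetical, not the original post's data, and the exact exception type and message depend on the installed pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical 2-D arrays of shape (3, 1); not the original post's data.
X = np.array([[1], [2], [3]])
Y = np.array([[4], [5], [6]])

# Passing 2-D arrays as dict values fails, because each column must be 1-D.
try:
    pd.DataFrame({"X": X, "Y": Y})
    raised = False
except Exception as err:  # exact type and message vary by pandas version
    raised = True
    print(type(err).__name__, err)

# One possible fix: flatten each array to 1-D before building the DataFrame.
df = pd.DataFrame({"X": X.ravel(), "Y": Y.ravel()})
print(df)
```

Flattening with ravel() only makes sense when each array really holds a single column of values.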

How to Fix: If using all scalar values, you must pass an index

This tutorial shows how to fix the following pandas error: ValueError: If using all scalar values, you must pass an index. You encounter this error when you try to create a DataFrame from all scalar values without also passing an index. Reproduce the Error. Output: ValueError: If using all scalar values, …
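
As a rough illustration of the error described above, the sketch below reproduces it and shows two common ways around it. The column names A and B and their values are hypothetical.

```python
import pandas as pd

# All values are scalars, so pandas cannot infer how many rows to create.
try:
    pd.DataFrame({"A": 1, "B": 2})
    message = ""
except ValueError as err:
    message = str(err)
    print(message)

# Fix 1: pass an index explicitly.
df_index = pd.DataFrame({"A": 1, "B": 2}, index=[0])

# Fix 2: wrap each scalar in a list so every column has length 1.
df_lists = pd.DataFrame({"A": [1], "B": [2]})

print(df_index)
print(df_lists)
```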

How to Combine Multiple NumPy Arrays into a DataFrame

This tutorial shows how to combine multiple arrays (e.g., two arrays X and Y) into a pandas DataFrame. The following summarizes the two methods. Method 1: pd.DataFrame({'X': X, 'Y': Y}). Method 2: combined_array = np.column_stack((X, Y)), followed by pd.DataFrame(combined_array, columns=['X', 'Y']). Two Examples of Combining Arrays into a DataFrame. Example for Method 1: In the following, we create two arrays, …
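
The two methods summarized above can be sketched as follows. The array values are hypothetical, since the original example is cut off.

```python
import numpy as np
import pandas as pd

# Hypothetical 1-D arrays of equal length.
X = np.array([1, 2, 3])
Y = np.array([4, 5, 6])

# Method 1: pass a dict that maps column names to arrays.
df_1 = pd.DataFrame({"X": X, "Y": Y})

# Method 2: stack the arrays as columns of one 2-D array, then name the columns.
combined_array = np.column_stack((X, Y))
df_2 = pd.DataFrame(combined_array, columns=["X", "Y"])

print(df_1)
print(df_2)
```

One difference worth keeping in mind: np.column_stack() returns a single array with one common dtype, so Method 2 coerces mixed-type inputs, while Method 1 keeps each column's own dtype.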

How to Create a Dummy Variable in Python

This tutorial shows two methods of creating dummy variables in Python. The following shows the key syntax. Method 1: Use numpy.where() to create a dummy variable: np.where(df['column_of_interest'] == 'value', 1, 0). Method 2: Use apply() and a lambda function to create a dummy variable: df['column_of_interest'].apply(lambda x: 1 if x == 'value' else 0). Example 1: Use numpy.where() to create …
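
A small sketch of both methods, applied to a hypothetical city column (the column name and values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data; "city" plays the role of column_of_interest.
df = pd.DataFrame({"city": ["City1", "City2", "City1", "City3"]})

# Method 1: vectorized comparison with numpy.where().
df["dummy_where"] = np.where(df["city"] == "City1", 1, 0)

# Method 2: element-wise check with apply() and a lambda function.
df["dummy_apply"] = df["city"].apply(lambda x: 1 if x == "City1" else 0)

print(df)
```

Both columns hold the same values. On large DataFrames the numpy.where() version is usually faster because it avoids a Python-level call per row.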

Calculate Means Grouped by Two Columns in Pandas (3 Examples)

The following provides three methods of calculating means grouped by two columns in pandas. Method 1: df.groupby(["column_1", "column_2"]).mean(). Method 2: df.groupby(["column_1", "column_2"]).agg('mean'). Method 3: pd.crosstab(index=df['column_1'], columns=df['column_2'], values=df['dv'], aggfunc='mean'). Prepare the data. Output:
   city   store   sales
0  City1  store1  10
1  City1  store2  20
2  City1  store1  20
3  City1  store2  50
4  City1  store1  30
5  City2  store2  10
…
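
The three methods can be sketched on the six rows visible in the excerpt above. The remaining rows are truncated, so the means printed here will differ from those in the full post.

```python
import pandas as pd

# Only the first six rows shown in the excerpt; the rest of the data is truncated.
df = pd.DataFrame({
    "city": ["City1", "City1", "City1", "City1", "City1", "City2"],
    "store": ["store1", "store2", "store1", "store2", "store1", "store2"],
    "sales": [10, 20, 20, 50, 30, 10],
})

# Method 1: groupby() followed by mean().
means_1 = df.groupby(["city", "store"]).mean()

# Method 2: groupby() followed by agg('mean').
means_2 = df.groupby(["city", "store"]).agg("mean")

# Method 3: crosstab() with values and aggfunc, which returns a wide table.
means_3 = pd.crosstab(index=df["city"], columns=df["store"],
                      values=df["sales"], aggfunc="mean")

print(means_1)
print(means_3)
```

Methods 1 and 2 return a long table indexed by (city, store). Method 3 returns one row per city and one column per store, with NaN for combinations that have no rows.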

Outer Join in Pandas

An outer join returns all records from both the left and right DataFrames. When rows in one DataFrame have no match in the other, the joined DataFrame has NaN in the cells of the unmatched rows. We can use how='outer' in join() to outer join two DataFrames in pandas. The basic syntax is as follows, in which df_1 …
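
A minimal sketch of an outer join on the index, using two small hypothetical DataFrames whose labels only partly overlap:

```python
import pandas as pd

# Hypothetical data: the two frames share only the index label "b".
df_1 = pd.DataFrame({"Brand": ["Tesla", "Toyota"]}, index=["a", "b"])
df_2 = pd.DataFrame({"Location": ["CA", "NY"]}, index=["b", "c"])

# how="outer" keeps every index label found in either DataFrame.
outer = df_1.join(df_2, how="outer")

print(outer)
```

Row a has no match in df_2 and row c has no match in df_1, so their missing cells are filled with NaN.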

Left Merge in Pandas Python

Passing how='left' to merge() left merges two DataFrames. The following is the pandas syntax, in which df_1 and df_2 are the two DataFrames to be merged: df_1.merge(df_2, how='left', left_index=True, right_index=True). Step 1: Prepare the data to be left merged. The following are the two DataFrames to be left merged. df_1: Brand Location a …
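
The syntax above can be tried on a pair of hypothetical DataFrames. The column names loosely follow the excerpt, but the rows and the Units column are assumptions, since the original data is cut off.

```python
import pandas as pd

# Hypothetical data; the original post's rows are truncated.
df_1 = pd.DataFrame(
    {"Brand": ["Tesla", "Toyota", "Ford"], "Location": ["CA", "CA", "MI"]},
    index=["a", "b", "c"],
)
df_2 = pd.DataFrame({"Units": [10, 30]}, index=["a", "d"])

# Left merge on the index: keep every row of df_1, attach matches from df_2.
merged = df_1.merge(df_2, how="left", left_index=True, right_index=True)

print(merged)
```

Rows b and c have no match in df_2, so Units is NaN for them, and row d of df_2 is dropped because it does not appear in df_1.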

How to Save a Pandas DataFrame as a CSV File

To save a pandas DataFrame as a CSV file, you can use the df.to_csv() method. The following shows the steps. df.to_csv("file_name.csv"). Step 1: DataFrame example. The following Python code generates the sample DataFrame. The following is the printout of the generated sample DataFrame. df_1: Brand Location a Tesla CA b Toyota CA c …
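
A short sketch of the save step, using only the two complete rows visible in the excerpt. Writing to a temporary directory and reading the file back are additions for illustration, not part of the original post.

```python
import os
import tempfile

import pandas as pd

# The two complete rows from the excerpt; later rows are truncated.
df_1 = pd.DataFrame(
    {"Brand": ["Tesla", "Toyota"], "Location": ["CA", "CA"]},
    index=["a", "b"],
)

# Write to a temporary directory so the example leaves no file in the working directory.
csv_path = os.path.join(tempfile.mkdtemp(), "file_name.csv")
df_1.to_csv(csv_path)

# Read the file back, treating the first column as the index.
df_back = pd.read_csv(csv_path, index_col=0)
print(df_back)
```

By default to_csv() writes the index as the first column. Pass index=False if the index should not be saved.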

Pandas Left Join Two Dataframes

Introduction: There are two methods to left join two DataFrames in pandas. Method 1: Pass how='left' to join(): df_1.join(df_2, how='left'). Method 2: Pass how='left' to merge(): df_1.merge(df_2, how='left', left_index=True, right_index=True). Sample Data: The following are the two DataFrames to be left joined. …
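
To see that the two methods agree, here is a minimal sketch on hypothetical DataFrames (the original sample data is truncated, so the rows and the Units column are assumptions):

```python
import pandas as pd

# Hypothetical data with partly overlapping index labels.
df_1 = pd.DataFrame({"Brand": ["Tesla", "Toyota", "Ford"]}, index=["a", "b", "c"])
df_2 = pd.DataFrame({"Units": [10, 30]}, index=["a", "d"])

# Method 1: join() aligns on the index by default.
left_join = df_1.join(df_2, how="left")

# Method 2: merge() needs to be told to use both indexes as keys.
left_merge = df_1.merge(df_2, how="left", left_index=True, right_index=True)

print(left_join)
print(left_merge)
```

Both results keep all rows of df_1 and fill Units with NaN where df_2 has no matching index label.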