# How to Create a Contingency Table in Pandas

## Introduction of crosstab() function

You can use the pandas.crosstab() function to create a contingency table. It computes a simple cross tabulation of two (or more) factors. The following is the sample data

``````# Generate a sample dataset
import pandas as pd
car_data = {'Brand': ['Brand 1', 'Brand 1','Brand 2','Brand 1','Brand 3'],
'Location': ['CA', 'CA', 'CA','NY','MA'],
'Number':[200,20,300,400,500]}
car_data=pd.DataFrame(data=car_data)
print(car_data)``````
```     Brand Location  Number
0  Brand 1       CA     200
1  Brand 1       CA      20
2  Brand 2       CA     300
3  Brand 1       NY     400
4  Brand 3       MA     500```

## Method 1: Create a contingency (frequency) table in Pandas

By default, `pandas.crosstab()` computes a frequency table of the factors.

``````result=pd.crosstab(index=car_data['Brand'], columns=car_data['Location'])
print(result)``````
```Location  CA  MA  NY
Brand
Brand 1    2   0   1
Brand 2    1   0   0
Brand 3    0   1   0```

## Method 2: Add margin sums to frequency tables

You can add margin sums by setting the parameter of `margins=True` in `crosstab()`.

``````result=pd.crosstab(index=car_data['Brand'], columns=car_data['Location'],margins=True, margins_name="Total")
print(result)
``````
```Location  CA  MA  NY  Total
Brand
Brand 1    2   0   1      3
Brand 2    1   0   0      1
Brand 3    0   1   0      1
Total      3   1   1      5```

## Method 3: Use “values” and “aggfunc” in crosstab()

In some situation, you might not want to count the freqncy, but rather like sum or mean of cell values. You can do that by using the `values` and `aggfunc` parameters in the `crosstab()` function.

``````result=pd.crosstab(index=car_data['Brand'], columns=car_data['Location'],values=car_data['Number'],aggfunc='sum')
print(result)``````
```Location     CA     MA     NY
Brand
Brand 1   220.0    NaN  400.0
Brand 2   300.0    NaN    NaN
Brand 3     NaN  500.0    NaN```

Besides `sum`, you can also use `mean` for `aggfunc` in the `crosstab()` function.

``````result=pd.crosstab(index=car_data['Brand'], columns=car_data['Location'],values=car_data['Number'],aggfunc='mean')
print(result)``````
```Location     CA     MA     NY
Brand
Brand 1   110.0    NaN  400.0
Brand 2   300.0    NaN    NaN
Brand 3     NaN  500.0    NaN```