# Decision tree algorithm

Decision trees repeatedly split the dataset according to the conditions defined in their decision nodes. Each decision node has two or more branches, and each branch represents one possible answer to the node's condition, which determines how the data is split.
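The splitting described here can be sketched in a few lines of plain Python. This is only an illustration: the helper `split_by_answer`, the toy rows, and the two questions are hypothetical and are not how scikit-learn implements its trees.

```python
from collections import defaultdict


def split_by_answer(rows, question):
    """Group rows into branches, one branch per distinct answer to `question`."""
    branches = defaultdict(list)
    for row in rows:
        branches[question(row)].append(row)
    return dict(branches)


# Toy rows of (scaled_age, smoking_habit); the values are made up for illustration.
rows = [(0.50, "never"), (0.69, "daily"), (0.75, "never"), (0.94, "occasional")]

# A decision node with two branches: a yes/no question on a numeric feature.
by_age = split_by_answer(rows, lambda row: row[0] <= 0.7)

# A decision node with more than two branches: one branch per category.
by_habit = split_by_answer(rows, lambda row: row[1])

print(by_age)
print(by_habit)
```

Applying `split_by_answer` again to any of the resulting branches corresponds to moving one level deeper in the tree.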

Decision trees can handle both quantitative and qualitative features; continuous features are handled by splitting them into ranges. Leaf nodes can likewise hold categorical or continuous class labels: with categorical labels the tree performs classification, while with continuous labels the task is regression.
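As a rough sketch of the difference between the two tasks, the snippet below shows one common convention for turning the training labels that reach a leaf into a prediction: the majority class for classification and the mean for regression. The function names and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean


def classification_leaf(labels):
    """Predict the most frequent categorical label among the samples in a leaf."""
    return Counter(labels).most_common(1)[0][0]


def regression_leaf(targets):
    """Predict the mean of the continuous targets among the samples in a leaf."""
    return mean(targets)


# Categorical class labels: the leaf outputs a class.
class_prediction = classification_leaf(["N", "N", "O"])

# Continuous class labels: the leaf outputs a number.
value_prediction = regression_leaf([1.0, 2.0, 3.0])

print(class_prediction, value_prediction)
```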

## Importing libraries and packages

```
# Mathematical operations and data manipulation
import pandas as pd

# Model
from sklearn.tree import DecisionTreeClassifier

# Warnings
import warnings

warnings.filterwarnings("ignore")

%matplotlib inline
```

## Set paths

```
# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results to)
assets_path = "./assets"
```

The Fertility dataset aims to determine whether the fertility level of an individual has been affected by their demographics, their environmental conditions, and their previous medical conditions.

```
dataset = pd.read_csv(f"{data_path}/fertility_Diagnosis.csv", header=None)
```

## Exploring dataset

```
# Shape of the dataset
print("Shape of the dataset: ", dataset.shape)

dataset
```
```
Shape of the dataset:  (100, 10)
```
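|     | 0     | 1    | 2   | 3   | 4   | 5   | 6   | 7   | 8    | 9   |
|-----|-------|------|-----|-----|-----|-----|-----|-----|------|-----|
| 0   | -0.33 | 0.69 | 0   | 1   | 1   | 0   | 0.8 | 0   | 0.88 | N   |
| 1   | -0.33 | 0.94 | 1   | 0   | 1   | 0   | 0.8 | 1   | 0.31 | O   |
| 2   | -0.33 | 0.50 | 1   | 0   | 0   | 0   | 1.0 | -1  | 0.50 | N   |
| 3   | -0.33 | 0.75 | 0   | 1   | 1   | 0   | 1.0 | -1  | 0.38 | N   |
| 4   | -0.33 | 0.67 | 1   | 1   | 0   | 0   | 0.8 | -1  | 0.50 | O   |
| ... | ...   | ...  | ... | ... | ... | ... | ... | ... | ...  | ... |
| 95  | -1.00 | 0.67 | 1   | 0   | 0   | 0   | 1.0 | -1  | 0.50 | N   |
| 96  | -1.00 | 0.61 | 1   | 0   | 0   | 0   | 0.8 | 0   | 0.50 | N   |
| 97  | -1.00 | 0.67 | 1   | 1   | 1   | 0   | 1.0 | -1  | 0.31 | N   |
| 98  | -1.00 | 0.64 | 1   | 0   | 1   | 0   | 1.0 | 0   | 0.19 | N   |
| 99  | -1.00 | 0.69 | 0   | 1   | 1   | 0   | 0.6 | -1  | 0.19 | N   |

100 rows × 10 columns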

## Modelling

```
X = dataset.iloc[:, :9]
Y = dataset.iloc[:, 9]
```
```
model = DecisionTreeClassifier()
model.fit(X, Y)
```
`DecisionTreeClassifier()`
```
# Testing by performing a prediction for a new instance with feature values
```
```
['N']
```
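
The prediction cell above shows only its leading comment and its output. As a minimal, self-contained sketch of what such a call can look like, the example below fits a `DecisionTreeClassifier` on the five preview rows (0 to 4) shown earlier and predicts the class of a single instance. The names `X_small`, `y_small`, `model_small`, and `new_instance` are hypothetical, and the instance simply reuses the feature values of row 0; it is an assumption, not necessarily the instance the original cell used.

```python
from sklearn.tree import DecisionTreeClassifier

# Training rows copied from rows 0-4 of the dataset preview (nine features each).
X_small = [
    [-0.33, 0.69, 0, 1, 1, 0, 0.8, 0, 0.88],
    [-0.33, 0.94, 1, 0, 1, 0, 0.8, 1, 0.31],
    [-0.33, 0.50, 1, 0, 0, 0, 1.0, -1, 0.50],
    [-0.33, 0.75, 0, 1, 1, 0, 1.0, -1, 0.38],
    [-0.33, 0.67, 1, 1, 0, 0, 0.8, -1, 0.50],
]
# Class labels from column 9 of the same rows.
y_small = ["N", "O", "N", "N", "O"]

# random_state is fixed only to make the sketch reproducible.
model_small = DecisionTreeClassifier(random_state=0)
model_small.fit(X_small, y_small)

# Assumed instance: predict() expects a 2D structure, one inner list per instance.
new_instance = [[-0.33, 0.69, 0, 1, 1, 0, 0.8, 0, 0.88]]
pred = model_small.predict(new_instance)
print(pred)
```

Because `predict` takes a batch of instances, a single instance is wrapped in an outer list and the result comes back as a one-element array.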