1- Data Integration
Integrate the data from all sources into a single, common format.
2- Encoding
Encode the labels so that each one represents a logical fallacy.
3- Balancing
Balance the number of records per label in the dataset.
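
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline actually used: the field names (`text`, `label`, `sentence`, `fallacy`), the sample records, the integer encoding, and the choice of random undersampling for balancing are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def integrate(sources):
    """Step 1: bring records from every source into one (text, label) format.

    `sources` is a list of (records, text_key, label_key) tuples.
    Lowercasing and stripping labels is an assumption made for this sketch.
    """
    unified = []
    for records, text_key, label_key in sources:
        for rec in records:
            unified.append({
                "text": rec[text_key].strip(),
                "label": rec[label_key].strip().lower(),
            })
    return unified


def encode(records):
    """Step 2: map each distinct label name to an integer id (assumed scheme)."""
    names = sorted({r["label"] for r in records})
    label_to_id = {name: i for i, name in enumerate(names)}
    encoded = [{"text": r["text"], "label": label_to_id[r["label"]]} for r in records]
    return encoded, label_to_id


def balance(records, seed=0):
    """Step 3: randomly undersample every label down to the smallest class size."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for r in records:
        by_label[r["label"]].append(r)
    target = min(len(group) for group in by_label.values())
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], target))
    return balanced


# Hypothetical records from two sources that use different field names.
source_a = [
    {"text": "You can't trust his argument, he's a liar.", "label": "Ad Hominem"},
    {"text": "She is wrong because she is young.", "label": "ad hominem"},
    {"text": "Everyone does it, so it is fine.", "label": "Ad Populum"},
]
source_b = [
    {"sentence": "Millions agree, so it must be true.", "fallacy": "ad populum"},
    {"sentence": "He failed once, so ignore him.", "fallacy": "ad hominem "},
]

unified = integrate([(source_a, "text", "label"), (source_b, "sentence", "fallacy")])
encoded, label_to_id = encode(unified)
balanced = balance(encoded)

print(label_to_id)
print(Counter(r["label"] for r in balanced))
```

Undersampling discards records from the larger classes; oversampling the smaller classes is an equally valid alternative when the dataset is small.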