-Identify the type of each attribute.

Attribute	Type
store_nbr	nominal
family	categorical
onpromotion	discrete
sales	continuous (target attribute)
date	interval
city	categorical
cluster	discrete
type (stores dataset)	categorical
holiday	categorical
type_holiday	categorical
transferred	nominal (binary)
oil price	continuous
dcoilwtico	continuous
Local	categorical
Local name	categorical
Description	categorical

-Stores

The company owns 54 branches nationwide.
Not all branches opened at the same time; Store 22 is one example.
As shown in the figure, its total sales are zero over a long period, indicating that the store had not yet opened during that time.


Will this affect the accuracy of the model? Yes. Therefore, this issue will be addressed in preprocessing.
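
The notes do not say how the pre-opening period is handled, so the following is only one possible sketch: drop every row dated before a store's first day with non-zero total sales. The column names `store_nbr`, `date`, `family` and `sales` come from the attribute table above; the function name `drop_pre_opening_rows` and the sample values are hypothetical.

```python
import pandas as pd


def drop_pre_opening_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Drop each store's rows dated before its first day with non-zero total sales."""
    # Total sales per store per day, summed over all product families.
    daily = df.groupby(["store_nbr", "date"])["sales"].sum().reset_index()
    # First date on which each store sold anything at all.
    first_sale = (
        daily[daily["sales"] > 0]
        .groupby("store_nbr")["date"]
        .min()
        .rename("first_sale_date")
        .reset_index()
    )
    # Inner merge: stores that never recorded a sale are dropped entirely.
    out = df.merge(first_sale, on="store_nbr", how="inner")
    keep = out["date"] >= out["first_sale_date"]
    return out.loc[keep].drop(columns="first_sale_date").reset_index(drop=True)


# Hypothetical sample: store 22 has zero sales until 2013-01-03.
sample = pd.DataFrame(
    {
        "store_nbr": [1, 1, 1, 22, 22, 22],
        "date": pd.to_datetime(["2013-01-01", "2013-01-02", "2013-01-03"] * 2),
        "family": ["GROCERY"] * 6,
        "sales": [5.0, 0.0, 7.0, 0.0, 0.0, 3.0],
    }
)
trimmed = drop_pre_opening_rows(sample)
print(trimmed)
```

Zero-sales days after a store's first sale are kept on purpose, since they may reflect real closures or stock-outs rather than a store that does not exist yet.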

Untitled

These stores are distributed across 22 cities and 16 states.
After constructing the histogram for the 22 cities, it is evident that Quito has the highest total sales.


Most of the other cities show a similar sales range (1000 to 15000).
This is a reminder to apply scaling in the data preprocessing stage.
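
The notes do not name a specific scaling method. As an illustration only, min-max scaling maps a range such as 1000 to 15000 onto 0 to 1. The helper `min_max_scale` and the city labels below are hypothetical.

```python
import pandas as pd


def min_max_scale(values: pd.Series) -> pd.Series:
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = values.min(), values.max()
    if hi == lo:
        # Constant input: avoid division by zero and return all zeros.
        return pd.Series(0.0, index=values.index)
    return (values - lo) / (hi - lo)


# Hypothetical city totals spanning the range mentioned in the text.
city_sales = pd.Series(
    [1000.0, 8000.0, 15000.0], index=["city_a", "city_b", "city_c"]
)
scaled = min_max_scale(city_sales)
print(scaled)
```

Standardization (subtracting the mean and dividing by the standard deviation) would be an equally reasonable choice; which one fits better depends on the model used later.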
Does this mean that the region or city affects the sales of each store? Not necessarily, especially if the higher sales in a specific city or state are simply due to that city or state having the largest number of stores, as shown in the previous figure (Quito: 18 of the 54 stores, a share of 0.33).
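
One way to separate "more stores" from "better stores" is to divide each city's total sales by its number of stores. The sketch below assumes a sales table with `store_nbr` and `sales` and a stores table with `store_nbr` and `city`. The function name, the city `OtherCity` and all numbers are hypothetical.

```python
import pandas as pd


def sales_per_store_by_city(train: pd.DataFrame, stores: pd.DataFrame) -> pd.DataFrame:
    """Summarize total sales, store count, store share and sales per store for each city."""
    merged = train.merge(stores[["store_nbr", "city"]], on="store_nbr", how="left")
    summary = merged.groupby("city").agg(
        total_sales=("sales", "sum"),
        n_stores=("store_nbr", "nunique"),
    )
    # Share of all stores located in the city (for example 18 / 54 = 0.33).
    summary["store_share"] = summary["n_stores"] / stores["store_nbr"].nunique()
    # Average sales per store removes the effect of simply having more stores.
    summary["sales_per_store"] = summary["total_sales"] / summary["n_stores"]
    return summary.sort_values("total_sales", ascending=False)


# Hypothetical data: Quito has two stores, OtherCity has one.
stores = pd.DataFrame({"store_nbr": [1, 2, 3], "city": ["Quito", "Quito", "OtherCity"]})
train = pd.DataFrame({"store_nbr": [1, 2, 3], "sales": [100.0, 100.0, 150.0]})
summary = sales_per_store_by_city(train, stores)
print(summary)
```

In this toy data Quito leads on total sales only because it has more stores; per store, the other city sells more.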
To confirm this hypothesis:
The two plots show the sales of two stores in the same region. As we can see, their distributions and value ranges differ significantly. This indicates the presence of other factors that have a greater influence….
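
The comparison of two stores can also be checked numerically by putting their summary statistics side by side. This is a minimal sketch; the function name `compare_stores`, the store numbers and the sales values are hypothetical.

```python
import pandas as pd


def compare_stores(train: pd.DataFrame, store_a: int, store_b: int) -> pd.DataFrame:
    """Return descriptive statistics of sales for two stores, one row per store."""
    subset = train[train["store_nbr"].isin([store_a, store_b])]
    return subset.groupby("store_nbr")["sales"].describe()


# Hypothetical sales for two stores assumed to be in the same city.
train = pd.DataFrame(
    {
        "store_nbr": [1, 1, 1, 2, 2, 2],
        "sales": [10.0, 20.0, 30.0, 100.0, 500.0, 900.0],
    }
)
stats = compare_stores(train, 1, 2)
print(stats[["mean", "std", "min", "max"]])
```

A large gap in mean, spread and range between stores of the same city supports the point that location alone does not explain the differences.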