Data-driven complexity reduction of power system production cost models
With increasing amounts of intermittent renewable energy sources in today's grid, traditional long-term capacity expansion planning models require an external production cost model to ensure that flexibility requirements are met. However, running a full-year production cost model is computationally intensive, involving billions of constraints and variables. An efficient way to reduce this burden is to select a set of representative days that best captures the load, wind, and solar conditions of the whole year. Several techniques and metrics to select and validate the choice of representative days have been proposed in prior literature. However, most of them are heuristic in nature and lack mathematical or statistical validation. In this work, we develop a formal algorithm that selects the representative periods by reducing the dimension of the net load data and using statistical metrics to find the optimal number of clusters. We then validate the selected days with external metrics and with the results of the production cost model, obtained by scaling up the results of the representative-days implementation. We observe and analyse the differences in the results.
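The selection procedure outlined above can be illustrated with a minimal sketch. The abstract does not name the specific techniques, so everything here is an assumption made for illustration: PCA stands in for the dimension reduction, k-means for the clustering, the silhouette score for the statistical metric that picks the number of clusters, and the net load profile is synthetic. All function names are hypothetical.

```python
import numpy as np

def pca_reduce(X, var_keep=0.95):
    """Project daily profiles onto the leading principal components (assumed method)."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, var_keep)) + 1  # smallest k reaching var_keep
    return Xc @ Vt[:k].T

def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per day."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    dist = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)

def silhouette(Z, labels):
    """Mean silhouette score, used here as the assumed cluster-count metric."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    scores = []
    for i in range(len(Z)):
        same = labels == labels[i]
        if same.sum() == 1:          # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = D[i, same].sum() / (same.sum() - 1)
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores.append((b - a) / (max(a, b) or 1.0))
    return float(np.mean(scores))

def representative_days(Z, labels):
    """Pick the day nearest each cluster centroid; weight = days in that cluster."""
    reps = {}
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        centroid = Z[idx].mean(axis=0)
        day = idx[np.linalg.norm(Z[idx] - centroid, axis=1).argmin()]
        reps[int(day)] = len(idx)
    return reps

# Synthetic hourly net load for 365 days (illustrative data, not from the paper).
rng = np.random.default_rng(42)
days, hours = np.arange(365), np.arange(24)
seasonal = 1.0 + 0.3 * np.cos(2 * np.pi * days / 365)[:, None]
daily = 0.5 + 0.5 * np.sin(np.pi * (hours - 6) / 12)[None, :]
netload = 100.0 * seasonal * (0.5 + daily) + rng.normal(0.0, 2.0, size=(365, 24))

Z = pca_reduce(netload)
best_k = max(range(2, 9), key=lambda k: silhouette(Z, kmeans(Z, k)))
reps = representative_days(Z, kmeans(Z, best_k))

# Scale the representative days up to a full year and compare total energy.
approx_energy = sum(w * netload[d].sum() for d, w in reps.items())
rel_error = abs(approx_energy - netload.sum()) / netload.sum()
print(best_k, reps, round(rel_error, 4))
```

In this sketch, each representative day carries a weight equal to the number of days in its cluster. Scaling up the production cost results would then amount to a weighted sum of the per-day results, and the energy comparison at the end is one simple example of an external check on the selection.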