Shell.ai

Shell.ai Hackathon

Challenge Over

Technical/ Problem Statement

3 years ago

6

Replies (23)

0

Hi, where can I get the dataset? I would still like to participate although the contest is over.

3 years ago

0

Hey, once the challenge is over, we cannot share the dataset pertaining to this challenge. However, you can participate in HackerEarth's practice tests.

3 years ago

0

Can you share a link to the practice tests? It seems the competition is no longer accepting submissions.

3 years ago

0

Hi Harshal,

I have a small doubt regarding the evaluation of the solutions. The leaderboard shown to us uses the result for year 2019, and the private leaderboard will use the result for year 2020. Is it necessary that someone who is at 24th position on the public leaderboard will also be ranked 24th on the private leaderboard, or can the ranking also improve or decline? Is there any way to see the private leaderboard?

Regards

Shahrukh

3 years ago

0

The private leaderboard can have a different ranking, as your solution is evaluated on year 2020. Having said that, if you have a good solution for 2019 (public leaderboard), you would also have a good solution for year 2020 (private leaderboard), since the nature of the problem is exactly the same. The private leaderboard will be made available to view after the Hackathon ends.

3 years ago

Mohammed Suhail

0

Hi Harshal,

We have observed 30 demand points with zero demand for all previous years (2010-2018). Would you like us to keep these at zero for 2019 and 2020, or should we forecast them using our own logic, so that we have clean data for forecasting?

3 years ago

1

Hi Mohammed, as a moderator I can't comment on this. You can use your best judgement and progress.

3 years ago

Mohammed Suhail

0

For the leaderboard submission (upload prediction file), we can upload only one file. In the sample, we have only SCS, FCS and DS_ij. What about the demand D_i values for 2019? We need to upload those as well, right? Should we put them on a second sheet in the same Excel file, or should we submit without the D_i 2019 values? Kindly advise.

3 years ago

0

Hi Mohammed, D_i values are calculated by evaluation script using constraint No. 6.

3 years ago

0

hello @harshil patel ,

I had a doubt: what is the value of supply in the DS matrix? Is it the sum of SCS and FCS, or Smax itself?

3 years ago

0

Hi Vedang, Can you please elaborate your question? Apologies for not understanding.

3 years ago

1

Thanks Harshil, I have got the answer. I had actually asked the wrong question, sorry for that.

3 years ago

0

welcome Vedang :)

3 years ago

0

Hello, I am currently encountering an issue where I am constantly getting a constraint 5 violation. I have checked that the total demand to each supply point is either equal or less than the available supply. Has anyone else encountered this, if so any suggestions for fixes?

3 years ago

0

As a moderator, I can advise you to check the definition of constraint 5 again. Specifically: 1) how Summation is happening in D_ij matrix and 2) How Smax_j is calculated. More details in the problem statement. If your problem still persists, you can mail your solution file to HackerEarth and we will examine it.

3 years ago

0

Hi Johnny, one more point: ensure that constraint 5 is followed by solutions of both year 2019 and 2020 independently.

3 years ago

0

Hi Harshil, many thanks for your suggestions! I have ensured that the total demand (across all 4096 demand points) to each respective supply point does not exceed the available supply given by 200*SCS+400*FCS, and that this holds for both 2019 and 2020. This still results in a constraint 5 violation. I have, however, reduced the total demand to each supply point so that there is a minuscule excess of supply (order of 1E-5), and HackerEarth accepts my solution. This, however, does not seem methodical or robust, so I have emailed support@hackerearth.com with my solution file for further checks on your end. Does the site only handle floats up to a certain number of significant figures, possibly contributing to a rounding error on your end?
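For anyone wanting to reproduce this check locally, here is a minimal sketch in Python. It assumes constraint 5 means what the post describes (the column sum of DS for each supply point must not exceed 200*SCS + 400*FCS). The function names and the `tol` parameter are hypothetical, and the evaluator's actual tolerance is not stated anywhere in this thread.

```python
import math


def supply_capacity(scs, fcs):
    """Capacity of one supply point as described in the post: 200*SCS + 400*FCS."""
    return 200 * scs + 400 * fcs


def constraint5_violations(ds, scs, fcs, tol=0.0):
    """Return the supply indices j whose served demand exceeds capacity by more than tol.

    ds  : list of rows, ds[i][j] = demand of point i served by supply point j
    scs : slow-charger counts, one per supply point
    fcs : fast-charger counts, one per supply point
    tol : hypothetical slack (assumption; the real evaluator's tolerance is unknown)
    """
    violations = []
    for j in range(len(scs)):
        # math.fsum limits the rounding error accumulated over thousands of terms
        served = math.fsum(row[j] for row in ds)
        if served > supply_capacity(scs[j], fcs[j]) + tol:
            violations.append(j)
    return violations


# One supply point with a single slow charger (capacity 200),
# served demand is 200.000001, a tiny excess over capacity.
ds = [[100.0], [100.000001]]
strict = constraint5_violations(ds, scs=[1], fcs=[0])
relaxed = constraint5_violations(ds, scs=[1], fcs=[0], tol=1e-5)
```

With a strict comparison the tiny excess is flagged, while a small tolerance lets it through. Leaving a small safety margin below capacity, as described in the post, sidesteps the question of which comparison the evaluator uses.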

3 years ago

1

Preferential Bias towards Underestimated Demands

Dear Organizers,

I believe the EV problem as constructed is "biased" towards the underestimation of demand. Please allow me to elaborate.

Let us assume that the actual demand for the year 2019 is D19_true. We have two hypothetical forecasts D19_over (overestimation) and D19_under (underestimation). Further assume: MAE(D19_true - D19_over) = MAE(D19_true - D19_under). In other words, Cost_DM is identical for both the demand predictions.

Now, in the case of D19_over scenario, one will need more "supply" to balance the excess demand. That means Cost_IF_over will always be higher than the Cost_IF_under case. Moreover, due to more demand, one needs to transport more "materials" to achieve optimal transport. Thus, Cost_CD_over will also be more than Cost_CD_under.

Thus, the overestimation of demand leads to more overall cost. In other words, identical Cost_DM can lead to significantly different "scores" due to over/under estimation; underestimation scenario will always lead to "better" scores. I would like to emphasize that I am assuming Cost_DM is the same for under and over estimated cases.

From a business perspective, underestimation case is more lucrative. However, overestimation is better from a consumer point of view.

I believe you can make the playing field level, if you use asymmetric cost for Cost_DM. You can penalize underestimation more than overestimation. Since you have access to the true demand, you can figure out the exact penalty. Changing Cost_DM does not impact the participants' coding.
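To make the argument concrete, here is a small sketch in Python. The numbers and the 2:1 weights are purely illustrative assumptions, not the organizers' cost definition. It shows that a symmetric MAE cannot tell the over and under forecasts apart, while an asymmetric variant can.

```python
def mae(true, pred):
    """Plain mean absolute error: symmetric in over- and underestimation."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


def asymmetric_mae(true, pred, under_weight=2.0, over_weight=1.0):
    """Mean absolute error with separate weights for under- and overestimation.

    The weights are hypothetical; the post only suggests that underestimation
    could be penalized more than overestimation.
    """
    total = 0.0
    for t, p in zip(true, pred):
        err = p - t
        if err > 0:
            total += over_weight * err      # forecast above the true demand
        else:
            total += under_weight * (-err)  # forecast below the true demand
    return total / len(true)


d_true = [10.0, 20.0, 30.0]
d_over = [12.0, 22.0, 32.0]   # every point overestimated by 2
d_under = [8.0, 18.0, 28.0]   # every point underestimated by 2

same_cost = mae(d_true, d_over) == mae(d_true, d_under)  # True
cost_over = asymmetric_mae(d_true, d_over)                # 2.0
cost_under = asymmetric_mae(d_true, d_under)              # 4.0
```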

Best regards,

Sukanta

3 years ago

0

Wow what a finding. Agree

3 years ago

1

Hi Sukanta, great observation! I must appreciate your understanding of the problem. To be honest, this is also a real dilemma for the EV charging business: whether to maximize profit or to maximize margins. In the demand underestimation case, your designed EV infra will be fully utilized at all times and you will maximize margins. But in this case, some of your customers will be dissatisfied and you may lose out on potential profits (albeit at lower margins, had you installed some more capacity). In the overestimation case, your EV infra may be a bit underutilized, but your customers will be happy and there is an opportunity to maximize profits at somewhat lower margins. So, the real metric to focus on here is Net Present Value (NPV), along with margins and profits. However, that is outside the scope of the current competition. For this competition, let's play by the already set rules. Once again, a great observation!

3 years ago

Abhishek kotcharlakota

0

Hello. I have a question about Cost of Customer Dissatisfaction. You said that this cost is calculated by multiplying the distance matrix with the demand-supply matrix. Here, Distance Matrix is the distances from EVERY demand point to EVERY supply point. Why should demand(i) care about every supply(j) in the region? As long as there are supply(j) points close to demand(i) and have enough capacity, there's no need to calculate distance and dissatisfaction to ALL supply(j) points in the entire region. Please tell me why you chose to define objective function this way.

3 years ago

0

'Close to' is a subjective term. We don't know how close these supply points are to the demand points unless we develop a solution satisfying all constraints. So, to convert the subjective 'close to' into an objective value, we have added a cost of customer dissatisfaction. Think about this: if we don't have this cost, your optimization algo doesn't have an incentive to obtain a solution where these supply points are indeed 'close to' the demand points.
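A rough illustration of this point in Python. It assumes the cost is the element-wise product of the distance matrix and the DS matrix, summed over all (i, j) pairs, which is how the question above reads it. Any scaling factor from the problem statement is left out, and the function name and numbers are made up for the example.

```python
def dissatisfaction_cost(dist, ds):
    """Sum of dist[i][j] * ds[i][j] over all demand points i and supply points j."""
    return sum(
        dist[i][j] * ds[i][j]
        for i in range(len(ds))
        for j in range(len(ds[i]))
    )


# Two demand points, two supply points (toy distances).
dist = [
    [1.0, 5.0],  # demand point 0 is close to supply point 0
    [4.0, 2.0],  # demand point 1 is close to supply point 1
]

# Same total demand served in both cases, only the routing differs.
ds_near = [[10.0, 0.0], [0.0, 8.0]]  # each demand point uses its nearest supply point
ds_far = [[0.0, 10.0], [8.0, 0.0]]   # each demand point uses the farther one

cost_near = dissatisfaction_cost(dist, ds_near)  # 1*10 + 2*8 = 26.0
cost_far = dissatisfaction_cost(dist, ds_far)    # 5*10 + 4*8 = 82.0
```

Pairs with DS_ij = 0 contribute nothing, so distant supply points only matter if the solution actually routes demand to them.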

3 years ago

0

Hi All,

I am unable to find the datasets that are to be provided. Please help me with the navigation for these datasets

3 years ago

1

You will need to 1) register for this challenge, 2) create/join a team and 3) start the challenge. You will then be redirected to the data/submission/leaderboard page, from where you can download the data.

3 years ago

Mohammed Suhail

1

For calculating the cost, kindly help us understand the Cost of Demand Mismatch: how do we access Dtrue,i for calculating the cost?

3 years ago

0

D_History is given to you from year 2012 to 2018. D_True for year 2019 and 2020 is not given (hidden from you). You have to forecast Demand (D_Forecast) for year 2019 and 2020 as close to D_True as possible. You actually can't compute the Cost of Demand Mismatch on your side, as D_True is hidden from you. We do it when you submit your solution.

3 years ago

Mohammed Suhail

0

thanks, Harshil, for nicely explaining.

3 years ago

Abhishek kotcharlakota

0

Harshil, D_Forecast is not a part of the solution submission (you are only asking for DS, SCS and FCS from us). So how will you even compute the Cost of Demand Mismatch?

3 years ago

0

I guess D_Forecast is derived from the DS matrix which we submit. So, for example, for a demand point 'i' it will be the sum over j of DS[i][j].

3 years ago

0

As mentioned by Kirushikesh DB, D_Forecast_i is computed using DS_ij using constraint 6 by us.
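In code, the row-sum reading described above looks like this. It is a minimal sketch: the function name is hypothetical and the toy matrix is invented for illustration, while the real matrix would be 4096X100.

```python
def forecast_from_ds(ds):
    """D_Forecast_i as the sum over j of DS[i][j], one value per demand point."""
    return [sum(row) for row in ds]


# Toy DS matrix: 2 demand points, 3 supply points.
ds = [
    [3.0, 0.0, 1.5],  # demand point 0 is served by supply points 0 and 2
    [0.0, 2.0, 0.0],  # demand point 1 is served by supply point 1 only
]

d_forecast = forecast_from_ds(ds)  # [4.5, 2.0]
```

So the submitted DS matrix carries the demand forecast implicitly, which is why no separate D_i column is needed in the upload.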

3 years ago

1

What float precision is used to evaluate the DS Matrix constraints?

3 years ago

0

Can you please inform us which constraint are you referring to?

3 years ago

Mohammed Suhail

1

Hi Harshal,

Thank you for the explanation. I would request a small clarification.

Di - EV Charging demand at i th point.

DMij - Demand Supply Matrix (it says how much demand of each demand point is supplied by each supply point).

Here Di is understood well with values from the provided file, Demand_ History.csv

And understood DMij is the Demand supply matrix we need to create ourselves.

THE QUESTION IS: WHAT VALUES (NUMBERS) WILL DMij HAVE? CAN YOU GIVE AN EXAMPLE WITH NUMBERS, SPECIFYING ONE i POINT AND ONE j POINT, SO WE CAN UNDERSTAND IT WELL?

This explanation will really help us to completely understand the entire concept.

3 years ago

1

Hi Mohammed, please go through the webinar video (Q&A section at the end), where we have addressed this question. Also, I suggest you revisit the last two constraints to understand it better.

3 years ago

Mohammed Suhail

1

Thank you Harshil for the advice. Let's say that for some i-th point the demand is Di = D0 = 13.1195, and there are 100 parking slots by which the demand will be met. We know the nearby parking slots will be responsible for meeting most of this demand; for far parking slots, the values will be zero. Can you give one detailed example of how this is done, considering the data given for 2018 for Di and the parking slots (number of FCS and SCS)? This is to understand it well.

3 years ago

1

Technically: Demand at i_th demand point can be satisfied by ANY j_th supply point. Ideally: Satisfy the demand by NEARBY supply point given that all constraints are satisfied. Factually: demand will be AUTOMATICALLY satisfied by nearby points if you minimize the cost function. That's how the cost function is formulated.
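A tiny brute-force sketch of the "Factually" point, in Python. Everything here is an illustrative assumption (one demand point, two supply points, made-up distances and capacities, integer splits only). A real solution would use a proper optimizer over the full DS matrix.

```python
def best_split(demand, distances, capacities):
    """Try every integer split of one demand across two supply points.

    Returns (cost, amount_to_supply_0, amount_to_supply_1) for the cheapest
    feasible split, where cost = distance * amount summed over both points.
    """
    best = None
    for to_first in range(demand + 1):
        to_second = demand - to_first
        # Skip splits that exceed either supply point's capacity
        if to_first > capacities[0] or to_second > capacities[1]:
            continue
        cost = distances[0] * to_first + distances[1] * to_second
        if best is None or cost < best[0]:
            best = (cost, to_first, to_second)
    return best


# Demand of 10; supply point 0 is near (distance 1), supply point 1 is far
# (distance 5); each can serve at most 6.
result = best_split(demand=10, distances=(1, 5), capacities=(6, 6))
# result == (26, 6, 4): the near point is filled to capacity first,
# and only the remainder goes to the far point.
```

Nothing in the code tells it to prefer the nearby point. That preference falls out of minimizing distance times served demand under the capacity limits.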

3 years ago

1

Hello everyone,

My team and I are facing issues predicting the demand values for 2019 and 2020 because very few rows (i.e. 2010 to 2018) are given. I suspect we are using the wrong approach, and I would appreciate it if anyone could suggest a data preparation method that works.

3 years ago

0

Hi Yusuf, As a moderator I would refrain from answering this question to be fair to everyone. Fellow participants, please help Yusuf.

3 years ago

1

Hello Yusuf, try to use a time series forecasting method or other machine learning algorithms. I hope these 9 points are enough to predict the demand for 2019 and 2020, because I got a good result using one of those.
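As one very simple baseline of this kind, here is a sketch in Python that fits a straight line through the yearly values of a single demand point and extrapolates it. This is not the method anyone in this thread reported using. The function name and the sample numbers are made up, and clipping at zero is an assumption made only to avoid negative demand.

```python
def linear_trend_forecast(years, values, target_year):
    """Least-squares straight-line fit through (year, value), evaluated at target_year."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Clip at zero so a falling trend never yields negative demand (assumption)
    return max(0.0, intercept + slope * target_year)


years = list(range(2010, 2019))           # the nine yearly points, 2010 to 2018
values = [1.0 + 2.0 * k for k in range(9)]  # toy demand history growing by 2 per year

d_2019 = linear_trend_forecast(years, values, 2019)  # about 19.0
d_2020 = linear_trend_forecast(years, values, 2020)  # about 21.0
```

The fit would be repeated for each demand point. A demand point whose history is all zeros gets a forecast of zero with this approach.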

3 years ago

0

Thank you Kirushikesh

3 years ago

Hardikkumar Zalavadia

1

Hi, the second cost function is the mismatch between predicted and true demand forecasts. So without the true forecasts available, how shall we take that cost function into consideration for computing the overall cost for years 2019 and 2020?

3 years ago

0

D_History is given to you from year 2012 to 2018. D_True for year 2019 and 2020 is not given (hidden from you). You have to forecast Demand (D_Forecast) for year 2019 and 2020 as close to D_True as possible.

3 years ago

0

Hello Harshil, so how can we include "Cost of Demand Mismatch" in our cost function for optimizing our model?

3 years ago

0

Hi Deep, it's totally up to you how you want to formulate your problem mathematically. I can't comment on the specifics.

3 years ago

0

Hi Harshil,

I find in the sample submission file that all demand point index values for data types SCS and FCS are NULL. Does it always need to be like that, or should we insert some values in future submissions?

3 years ago

0

Hi Kunisetty, yes. SCS and FCS are quantities related to the supply point. So, the demand point index will be disregarded for them while evaluating your submission. Best to keep the demand point index empty/blank for SCS and FCS.

3 years ago

3

Hi,

In your evaluation script, how is the distance function defined? Is it Euclidean (L2) or taxicab (L1)?

3 years ago

1

Hi Sukanta, we have considered direct/Euclidean distance. Great question, and thanks for clarifying it for other participants as well.
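For reference, a short Python sketch of a demand-to-supply distance matrix under the Euclidean (L2) definition confirmed above. The function name and coordinates are invented for illustration. The real inputs would be the 4096 demand points and 100 supply points from the provided data.

```python
import math


def distance_matrix(demand_xy, supply_xy):
    """Euclidean distance from every demand point (rows) to every supply point (columns)."""
    return [
        [math.hypot(dx - sx, dy - sy) for (sx, sy) in supply_xy]
        for (dx, dy) in demand_xy
    ]


demand_xy = [(0.0, 0.0), (3.0, 4.0)]  # toy demand point coordinates
supply_xy = [(3.0, 4.0), (0.0, 8.0)]  # toy supply point coordinates

dist = distance_matrix(demand_xy, supply_xy)
# dist == [[5.0, 8.0], [0.0, 5.0]]
# For comparison, the taxicab (L1) distance from (0, 0) to (3, 4) would be 7.
```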

3 years ago

danielgbenga814

0

Hello. Please, what evaluation script are you referring to?

3 years ago

0

Hey... I am unable to find the dataset mentioned in the problem_statement.pdf.

Can anyone help me out?

3 years ago

0

You will need to 1) register for this challenge, 2) create/join a team and 3) start the challenge. You will then be redirected to the data/submission/leaderboard page, from where you can download the data.

3 years ago

Dharmendra kumar

2

How will we get the Dtrue value? Is it given?

3 years ago

1

D_History is given to you from year 2012 to 2018. D_True for year 2019 and 2020 is not given. You have to forecast Demand (D_Forecast) for year 2019 and 2020 as close to D_True as possible.

3 years ago

DEMBELE MAMADOU

0

Hi guys. I think I understand the problem. Does anyone know what class of problems this challenge belongs to? I mean, what kind of algorithms do we use to solve this kind of problem? I know (basic) graph theory algorithms and machine learning models (Random forest, SVM, ...), but I can't match my algorithms to this problem.

3 years ago

2

Hi dimbos1997, this is a constrained optimization problem. You can read more on it here: https://en.wikipedia.org/wiki/Constrained_optimization

3 years ago

Santhosh Sankar

0

So this is not a machine learning problem. Please change the title on the HackerEarth website.

3 years ago

0

Please solve the following queries

What does DS represent and how can it be calculated?
Do supply_point_index and demand_point_index represent the same thing? If not, how do we decide the correlation between demand and supply?

3 years ago

1

Hi Rajan,

I will answer your second question first.

2) A supply point is a (public/private parking) location where more EV charging stations can potentially be installed to cater to the increasing demand year-on-year. There are a total of 100 supply points for this problem, whose index and coordinates are already given. A demand point is the center (representative) point of a grid cell where the demand of that grid area is aggregated. We have considered 64X64 (total 4096) demand points for this problem. The index, coordinates and historic demand of each demand point are also shared with you.

1) Consumers from the i_th demand point can satisfy their EV charging demand from any j_th supply location. So, the DS matrix represents how much of the demand from the i_th demand point is catered to by the j_th supply point. Naturally, it would be a 4096X100 matrix.

3 years ago

1

How is DS calculated?

3 years ago

1

What is DS_ij?

Consumers from the i_th demand point can satisfy their EV charging demand from any j_th supply location. So, the DS_ij matrix represents how much of the demand from the i_th demand point is catered to by the j_th supply point. Naturally, it would be a 4096X100 matrix.

How is DS calculated?

DS is the result of the optimization problem that you are supposed to formulate and solve.

3 years ago