General Edition Discussion

Replies (41)
If anyone has a great Cost_forecast but is struggling with the optimization problem, have a look at my optimization strategy: https://github.com/raubenheimer/Genetic-Opt-for-Shell-Hackathon-2023/tree/main

Regrettably, I wasn't able to reduce my Cost_forecast sufficiently to be competitive.

Hello everyone,

I am looking to join a team, if there is still a place available in your group. My email: arolharlem@yahoo.fr. I am currently at the end of the nanodegree course AI Programming with Python.

 

I have a doubt about the supply chain elements, especially about computing the overall cost:

  • Based on the formulae for the cost of transportation, we need the quantity of biomass, the number of pellets moved between locations, and the distance between the locations. But how do we find the number of pellets?
  • And as we don't have true biomass values for either 2018 or 2019, how would we calculate the "Cost of biomass forecast mismatch"?

 

Any hint about how to reduce the dimensionality of the optimization problem? :)

Coming in quite late, but if anyone is interested in collaborating, email me at zeph.qamar4@gmail.com. I am a beginner with some experience in Python ML and a little in QGIS.

Hello. How can I also apply to the startup edition after applying to the general edition?

Can someone help me with the datasets? I am unable to find them on the hackathon page.

Please follow the instructions given on the main page.

Hi there everyone! I am ready to form a team. Do reach out through my LinkedIn: https://www.linkedin.com/in/sohan-mishra-17b010227/

If your forecasted info is ready, here is the algo you can consider to find the cities you want to install J's at: https://chat.openai.com/share/9fefae9c-8b1a-47b0-abfa-9e093fc0edce

Hello,

I'm Abhishek. I have a background in Agriculture science (don't know if that'll help) and am familiar with the basic concepts of ML & Python.

I'm a total beginner but highly motivated and want to participate and submit a relevant submission.

 

If anyone wants to collaborate and at least give it a try, contact me at my email: abhi.kumar0799@gmail.com.

 

Mail me at - akumar3_be19@thapar.edu

Hello everybody,

I just entered this competition. If anyone wants to collaborate, please reach out to me

 

https://www.linkedin.com/in/yanteix/

I'm a total beginner, but highly motivated. If you want, we can collaborate. PS. I have a bachelor's degree in Agriculture.

I am a Data Scientist at Eindhoven University of Technology, Eindhoven, The Netherlands. I have 2 years of work experience as a Professional Data Scientist along with a Masters in Machine Learning and AI from Liverpool John Moores University, UK. I am open for collaboration. Please share your email address and contact details at nc2012@cse.jgec.ac.in

Hi Sir, I am interested in collaborating and joining your team. My email is arolharlem@yahoo.fr

I have some questions regarding the Datasets for the hackathon:

Firstly, in the distance matrix dataset, how can I tell which location is a harvesting site, a depot or a biorefinery?

Secondly, what is the formula for calculating the demand-supply matrix of Biomass and Pellet? I have no clue how to combine the forecast of biomass availability with the distances (supply chain).

Thirdly, in the submission file there will be 3 years: 2018, 2019 and 20182019. I do not understand the meaning of 20182019.

Please go through the detailed problem statement (provided on the main page of the hackathon): https://apse1-uc.hackerearth.com/he-public-data/Detailed%20Problem%20Statementb5c7c96.pdf. You will find answers to all your queries in it.

I have gone through the PDF containing the detailed problem statement thoroughly, and these are the questions I still have after reading it. If these questions are not answered properly, I will choose to drop out of this hackathon.

  • Let me try to address your queries. Note that the objective is to i) forecast the biomass for 2018 and 2019 at the harvesting sites and then ii) use your forecasted values to place supply-chain elements such as depots and refineries.
  • There are 2418 locations in total. Each location is a harvesting site. You are free to place depots and refineries at any location, provided that you follow the constraints. The distance matrix (2418 x 2418) gives the distance from any source location to any destination location.
  • The demand-supply matrix of Biomass and Pellet is the solution you are supposed to provide. It will depend on your forecasted biomass and on the placement of your supply-chain elements (depots and refineries).
  • So in summary, the forecasted biomass and the Biomass and Pellet demand-supply matrices are year dependent, whereas the supply-chain elements (depots and refineries) do not vary with the year. Hence, a special key 20182019 is used for the depot and refinery locations in the solution.
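
To make the year keys concrete, here is a minimal sketch of how such a solution could be laid out as rows. It is only an illustration, not the official format: the column names (`year`, `source_index`, `destination_index`, `value`) and the labels `depot_location`, `refinery_location` and `biomass_forecast` are assumptions; only `data_type`, `biomass_demand_supply`, `pellet_demand_supply` and the key 20182019 are mentioned in this thread. All numbers are made up, so check everything against the sample submission file.

```python
def row(year, data_type, source=None, destination=None, value=None):
    # Column names are assumed for illustration; verify against the sample file.
    return {"year": year, "data_type": data_type, "source_index": source,
            "destination_index": destination, "value": value}


def build_rows(depots, refineries, forecasts, biomass_flows, pellet_flows):
    """Assemble submission-style rows.

    depots, refineries: location indices, identical for both years.
    forecasts:     {year: {site: forecasted biomass}}
    biomass_flows: {year: {(site, depot): quantity}}
    pellet_flows:  {year: {(depot, refinery): quantity}}
    """
    rows = []
    # Infrastructure does not vary with the year, so it uses the key 20182019.
    for d in depots:
        rows.append(row(20182019, "depot_location", source=d))
    for r in refineries:
        rows.append(row(20182019, "refinery_location", source=r))
    # Forecasts and flows are year dependent, so they are written once per year.
    for year in sorted(forecasts):
        for site, qty in sorted(forecasts[year].items()):
            rows.append(row(year, "biomass_forecast", source=site, value=qty))
        for (src, dst), qty in sorted(biomass_flows[year].items()):
            rows.append(row(year, "biomass_demand_supply", src, dst, qty))
        for (src, dst), qty in sorted(pellet_flows[year].items()):
            rows.append(row(year, "pellet_demand_supply", src, dst, qty))
    return rows


# Toy data: one depot at location 5, one refinery at location 9.
rows = build_rows(
    depots=[5],
    refineries=[9],
    forecasts={2018: {0: 10.0, 1: 4.0}, 2019: {0: 12.0, 1: 3.0}},
    biomass_flows={2018: {(0, 5): 10.0}, 2019: {(0, 5): 12.0}},
    pellet_flows={2018: {(5, 9): 10.0}, 2019: {(5, 9): 12.0}},
)
```

The point of the sketch is only the split: locations appear once under 20182019, while forecasts and both flow matrices are repeated for 2018 and 2019.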

So to clarify, it's solely the 2019 score that will determine whether each team is shortlisted or not?

Yes, but you will only be able to see your 2019 score (on the private leaderboard) after the competition is over. During the competition, the public leaderboard shows your score for 2018.

got it, thanks!

Hello!! This is Shrinath Dakare. I work as an Operations Research Scientist at Optym. I'm proficient in OR concepts, C# and python. Also, have knowledge of basic ML. Looking for team members. Please reach out to https://www.linkedin.com/in/shrinath-dakare-iitkgp/.

Hello Dakare, if you're still looking for teammates I'm available and looking for a team. 

I am a chemical engineer with basic knowledge in ML and DL, and a solid background in math and statistics.

Please let me know if you're interested in forming a team.  https://www.linkedin.com/in/eedx/

I didn't find any dataset, not even under Instructions.

Instructions to access the data set:
  1. Register for the hackathon.
  2. Create a team with a team name (an individual team is allowed).
  3. Go back to the challenge page.
  4. Click "Start now".
  5. "Data set" is shown under Instructions; click on it to download the ZIP file.

Hi Massimiliano, I agree. The problem statement does not specify how the costs for both years will be aggregated. I.e. will the two costs just be summed, or will they be weighted differently? It would be nice to get some clarity about this.

Hi James, we have included the following in the detailed problem statement. Sorry you couldn't find it. It's in the Notes section: a) Your solution will be eligible for ranking only if it satisfies all the constraints for 2018 and 2019. b) We will keep the first year (2018) of your solution for the public leaderboard. You can test your solution any time and see how it ranks. c) We will keep the second year (2019) of your solution for the private leaderboard and it will be used to determine the finalists. So, the cost (and hence the leaderboard score) for 2018 and 2019 will be calculated individually and used for the public and private leaderboards, respectively.

Thanks for the clarification Harshil

Hi guys, I am in for collaboration. I currently do not have a team, but I hope to join or form a formidable one. I am completing my MSc in AI and Data Science at Keele University, UK. I am also starting a role as a Data Scientist with the Leeds Institute of Data Analytics, University of Leeds. I am available on LinkedIn at http://linkedin.com/in/sayo2rule, on Twitter @sayo2rule or by email at sayo2rule@yahoo.com.

Hi, we can work together: https://www.linkedin.com/in/tejasva-soni-0a81121ab/

Hello. I am searching for a team too.

Hi, it would be a pleasure if I could join your group. arolharlem@yahoo.fr

"Optimized supply chain infrastructure proposed in your solution must be the same for both year 2018 and 2019." But the forecast errors for 2018 would be different from those for 2019, so do we have to optimize for BOTH years simultaneously?

IMHO the problem is not so clearly described.

Hi Massimiliano, i) you are supposed to forecast the biomass for 2018 and 2019 and then ii) use your forecasted values to place supply-chain elements such as depots and refineries. Of course, the biomass forecasts for 2018 and 2019 will be different. However, the locations of the supply-chain elements are constant irrespective of the year. So, the idea is to design a supply chain that stays robust under a rapidly changing biomass forecast.
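
One way to read "robust" here is to score each candidate placement against both years' forecasts and keep the one whose worse year is cheapest. The sketch below does that on toy data. It is an illustration under strong assumptions: the cost is just distance times quantity to the nearest depot, capacities and the other constraints from the problem statement are ignored, and the worst-year criterion is one possible choice rather than the official scoring rule. The function names and all numbers are hypothetical.

```python
def transport_cost(dist, forecast, depots):
    """Distance-weighted cost of sending each site's biomass to its nearest depot.

    Stand-in cost only: the real objective and constraints are defined in the
    detailed problem statement.
    """
    total = 0.0
    for site, qty in enumerate(forecast):
        nearest = min(depots, key=lambda d: dist[site][d])
        total += dist[site][nearest] * qty
    return total


def worst_year_cost(dist, forecasts_by_year, depots):
    """Cost of one fixed placement in its most expensive year."""
    return max(transport_cost(dist, f, depots) for f in forecasts_by_year.values())


# Toy symmetric distance matrix for 4 locations (made-up numbers).
dist = [
    [0, 2, 5, 9],
    [2, 0, 3, 7],
    [5, 3, 0, 4],
    [9, 7, 4, 0],
]
# Forecasts that shift sharply between the two years.
forecasts = {2018: [10, 1, 1, 1], 2019: [1, 1, 1, 10]}

# Candidate placements: a single depot at each location in turn.
candidates = [[0], [1], [2], [3]]
best = min(candidates, key=lambda c: worst_year_cost(dist, forecasts, c))
```

In this toy case a depot at location 0 is cheapest for 2018 alone but expensive for 2019, while location 2 is the compromise that performs acceptably in both years.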

Hello everyone, in the sample submission file there are no biomass_demand_supply,pellet_demand_supply . So what are those columns? 

Hi Massimiliano, I think you mean you don't see them in the dataset (while they are expected in the sample file). I am not sure how we are expected to predict biomass_demand_supply & pellet_demand_supply if we were never provided with any figures.

You will need to optimize the supply chain and estimate i) biomass_demand_supply: the flow of biomass between harvesting sites and pre-processing depots, and ii) pellet_demand_supply: the flow of pellets between depots and refineries. These quantities will be the result of where you place the supply-chain elements (depots and refineries) and, of course, will be different for each year based on the forecasted biomass of that year.
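
As a rough illustration of how the two flow matrices fall out of a placement, the sketch below sends each site's forecasted biomass to its nearest depot and each depot's intake to its nearest refinery. This is a naive greedy rule, not an optimizer, and it rests on assumptions: no capacity limits, all forecasted biomass is collected, and biomass converts to pellets one-to-one. The function name and the data are hypothetical.

```python
from collections import defaultdict


def derive_flows(dist, forecast, depots, refineries):
    """Greedy nearest-facility flows for one year (illustration only).

    Returns (biomass_flow, pellet_flow), keyed by (source, destination).
    """
    biomass_flow = {}
    depot_intake = defaultdict(float)
    for site, qty in enumerate(forecast):
        if qty <= 0:
            continue  # nothing to ship from this site
        depot = min(depots, key=lambda d: dist[site][d])
        biomass_flow[(site, depot)] = qty
        depot_intake[depot] += qty

    pellet_flow = {}
    for depot, qty in depot_intake.items():
        refinery = min(refineries, key=lambda r: dist[depot][r])
        # Assumed 1:1 conversion of biomass into pellets.
        pellet_flow[(depot, refinery)] = qty
    return biomass_flow, pellet_flow


# Toy distance matrix and one year's forecast (made-up numbers).
dist = [
    [0, 2, 5, 9],
    [2, 0, 3, 7],
    [5, 3, 0, 4],
    [9, 7, 4, 0],
]
biomass_flow, pellet_flow = derive_flows(
    dist, [10.0, 1.0, 0.0, 4.0], depots=[1, 3], refineries=[2]
)
```

Running the same function with the 2018 and then the 2019 forecast, while keeping `depots` and `refineries` fixed, gives the year-dependent flows on top of year-invariant infrastructure.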

Hi Massimiliano and Kashani, adding on to what Harshil has mentioned above - 

  • The sample submission file does contain values for biomass_demand_supply and pellet_demand_supply. I would suggest taking another look at all the entries in column B i.e. 'data_type'. 
  • Kashani, Harshil's point directly answers your question. The prediction you're challenged to make here is an estimate of the amount of biomass available at the 2418 gridpoints for 2018 and 2019. You're provided with historical biomass values to help with this, and you may choose to incorporate other factors as you see fit. Plotting the historical data as a scatter plot of biomass values for each gridpoint will help you visualize them as figures, if needed. The part about 'biomass_demand_supply' and 'pellet_demand_supply' is not a prediction, but the flow of biomass that your supply chain produces when following the constraints, as Harshil rightly describes it.

 

Hope this helps!
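
For the forecasting part, a very simple baseline is to fit a straight line through each gridpoint's historical values and extrapolate it to 2018 and 2019. The sketch below does this with ordinary least squares in plain Python. It is only a starting point under assumptions: the history years and biomass values are made up, the function name is hypothetical, and a linear trend is one possible model among many.

```python
def linear_trend_forecast(years, values, target_year):
    """Extrapolate a least-squares line through (years, values) to target_year.

    The result is clipped at zero because biomass cannot be negative.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        return max(mean_y, 0.0)  # a single year of history: fall back to the mean
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return max(mean_y + slope * (target_year - mean_x), 0.0)


# Made-up history for two gridpoints; replace with the provided historical data.
history_years = [2014, 2015, 2016, 2017]
history = {
    0: [10.0, 12.0, 14.0, 16.0],  # steadily increasing
    1: [5.0, 5.0, 5.0, 5.0],      # flat
}

forecast = {
    year: {site: linear_trend_forecast(history_years, vals, year)
           for site, vals in history.items()}
    for year in (2018, 2019)
}
```

The resulting `forecast` dictionary has one entry per year and gridpoint, which is the input the supply-chain step needs.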
