Joaquín García de la Cruz’s industry placement in the travel industry
Placements offer the opportunity to experience the workplace environment and to see how workplace practices differ from academic ones. They also provide the chance to apply your skills in a new setting, solving new problems alongside a new group of people. The LIV.DAT CDT incorporates a six-month placement for its students, giving them industrial experience in which to apply their big data training and broaden their PhD experience.
At LIV.DAT, Joaquín García de la Cruz’s research focused on understanding the formation and evolution of Milky Way-like galaxies from both a theoretical and an observational point of view. In 2020, he spent six months with the online travel company eDreams Odigeo, where he worked within the Data Science department’s fraud team. This is an account of his experience.
eDreams Odigeo is an online travel company with almost 2000 employees, offering services that range from plane tickets to hotel reservations and car rentals. The company operates with worldwide databases containing thousands of airports, hotels, airlines, and so on. Data science is therefore crucial for managing such a vast amount of information. The data science group at eDreams Odigeo, based in the Barcelona office, is one of the biggest in the sector.
At the very beginning, Joaquín was given three projects from which he could choose one to work on. During the first couple of weeks, he spoke with a number of people to get a good idea of the relevant skills and timescales for these projects.
About his start, Joaquín said: “I was very excited to start working with them and see what I could learn in terms of data science but also the experience in a big and well established company. I really appreciated this freedom as to what I would learn and work on. In the meantime, I went through different inductions, and became familiar with the different procedures to access the different datasets.”
Joaquín opted to work with the fraud team within the Data Science department, where his project focussed on applying Big Data techniques to time series forecasting. A time series is any variable that changes with time, and its past behaviour can be used to forecast how it will change in the future. In his project, the variable in question was the amount of fraud committed against the company. Here, the team looks at bookings where money is claimed back after the booking is made. This can happen for different reasons, such as a client’s bank card details being stolen and a purchase being made without their consent.
What makes forecasting this variable particularly challenging is that it takes a while for past events to have an effect. For instance, after the booking, it takes some time for the client to realise it was not them who made the purchase. In addition, there is a legal process before the invoice reaches the company. There is therefore a lag between booking and claiming back that varies a lot with time, country, bank, etc. For example, you cannot model the series up until yesterday and start forecasting right away, because you have not yet received all the claims from past months. And if you only model the series up to the last point at which all claims have been received, you have to forecast over a longer period and your forecast starts to become unreliable. This is particularly important because the uncertainty grows with the length of the period for which you predict.
They used Machine Learning techniques for this project and combined information from months for which all claims had been received with partial information from claims received in the previous few months. Joaquín helped develop an algorithm that was able to predict accurately the money claimed from the company. In some cases, a 50% improvement in accuracy was achieved compared to previous forecasting models. This algorithm places a tighter upper limit on the losses the company could suffer due to fraud. This helps to make a more precise estimate of how much money the company needs to set aside for this rather than invest elsewhere. It became especially important due to COVID-19, when millions of bookings were cancelled and money was claimed back.
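To make the idea of combining complete and partial claim information more concrete, here is a minimal sketch in Python. It is not the algorithm developed at eDreams Odigeo, which the article does not describe in detail. It is a simple "completion factor" adjustment, assuming that the share of claims received a given number of months after booking is similar from month to month. The function names and all figures are hypothetical and chosen for illustration only.

```python
def completion_factors(history):
    """Estimate the share of the final claimed amount received by each lag.

    history[m][k] is the cumulative amount claimed for bookings made in
    month m, as known k months after the booking month. Every row must be
    fully reported, so its last entry is the final amount for that month.
    """
    n_lags = len(history[0])
    final_total = sum(row[-1] for row in history)
    return [sum(row[k] for row in history) / final_total for k in range(n_lags)]


def estimate_final_amount(partial_amount, months_elapsed, factors):
    """Scale up a partially reported month to an estimate of its final amount."""
    k = min(months_elapsed, len(factors) - 1)
    if factors[k] == 0:
        raise ValueError("no claims observed at this lag; cannot scale up")
    return partial_amount / factors[k]


# Hypothetical data: two past months for which all claims have arrived.
fully_reported = [
    [20, 60, 90, 100],
    [30, 90, 135, 150],
]
factors = completion_factors(fully_reported)

# A recent month with 45 claimed so far, one month after booking.
# With these made-up numbers the estimate is roughly 75.
print(estimate_final_amount(45, 1, factors))
```

Scaling recent months up in this way lets a forecasting model use data right up to the present instead of stopping at the last fully reported month. A real system would also need to account for the lag varying by country, bank and time, as described above.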
Joaquín commented: “The best aspect of this placement is that I created a deliverable that I know the company will continue to use in the future and it will be very helpful. My placement had a positive impact on the company and will continue to do so in the future. In addition, I got to know different programming tools and coding libraries they have internally developed, as well as a real grounding in SQL which I found really valuable.”
“When looking at the corona pandemic, Spain went into lockdown around halfway into the placement. This was not an easy situation to be in, but it could have been a lot worse. First of all, the support from the company and especially from my teammates was amazing. They made sure I was on the right track and kept helping me whenever I needed. Second of all, I was in my home country, and therefore I was able to navigate the situation way better than if I were in the UK. The saddest part was, I think, to finish the placement while still in lockdown. I could not say farewell properly to my teammates and every other colleague I met during the first months. Fortunately, after the lockdown lifted, I was able to meet some of them and say goodbye properly. This is a reflection on one of the things I liked the most about the placement: the environment and the people. Compared to my experience in academia, working in the private sector is way more collaborative and you spend less time by yourself wondering which line of code is causing havoc, or why the experiment did not work. If you are lucky to get into a company where there is such a supportive and collaborative environment, then the experience will be quite enjoyable and obstacles will be much easier to overcome.”