When you plan a vacation, you usually spend days preparing the trip: deciding which sites you would like to visit and reviewing, over and over, all the information on the most interesting ones.
Then, once you are at your destination, you tend to be interested in the history of the place, so you buy a guidebook or join a “free tour” to enjoy the city in the best possible way.
What if Artificial Intelligence could help you throughout this process, and be your ideal companion for your next trip?
Thanks to the challenge posed in the second edition of AI Saturdays, our group, formed by Javier Perez Chao, Alberto Granero García and Pablo Mateos Masa, set out to carry out this idea: to provide a feasible, practical, and real solution for the world of tourism through the lens of Machine Learning and Combinatorial Optimization, supported by Big Data processing techniques.
First steps towards the AI-based tourist assistant
Faced with such a challenge, the first thing we did was define what our tour assistant was going to cover and which parts it would contain. Given the limited time we had, we focused the project on the city of Madrid, both because of the easy access to data through the City Council’s open data and because we know the city well, since we live there.
Once the focus was fixed, we established as our main mission the creation of an environment capable of saving time and providing information to any type of tourist who wants to visit a city. The intention is that, in just a few seconds, an optimized tourist route can be generated that is 100% adapted to the type of user profile, be it cultural, gastronomic, leisure, sports, or economic.
For this reason, we proposed an app built on a point-of-interest recommender that takes into account both the user’s tastes and the relevance of each point of interest according to other tourist websites. Once the list of the places that best fit the user has been obtained, the next step is the construction of an optimal route given the available time.
Finally, the user will be able to consult the information about the points of interest of the suggested route by taking photos of said points through DL-based image recognition.
Because the recommender was the cornerstone of the project and the other two legs depend on it, we decided to start there.
Therefore, an arduous data collection task was carried out, drawing both on the open data provided by the City Council and on information extracted from different tourist websites, to reinforce our database and fill in missing information.
Because the points of interest (POIs) had to be segmented by theme, one of the main goals of the data collection was to gather their descriptions. In other words, the solution was to use a recommendation methodology based on Content-Based Filtering in order to cluster the POIs.
For this, the final solution was to apply different NLP techniques to preprocess the POI descriptions, and then fit an LDA (Latent Dirichlet Allocation) model to find the topics that group the POIs under a series of keywords.
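As a rough illustration of the preprocessing step, here is a minimal sketch that turns a description into the word counts a topic model is usually fitted on. The exact NLP steps we applied are not listed here, so the stop-word list, the token filter and the sample descriptions are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list; a real pipeline would use a fuller list
# for the language of the descriptions.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}


def preprocess(description):
    """Lowercase, keep alphabetic tokens, drop stop words and very short tokens."""
    tokens = re.findall(r"[a-záéíóúüñ]+", description.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]


def bag_of_words(tokens):
    """Count token occurrences; topic models such as LDA work on counts like these."""
    return Counter(tokens)


# Made-up descriptions, for illustration only.
descriptions = {
    "museum": "The museum is home to a collection of paintings and sculptures.",
    "park": "A park with a lake, gardens and paths in the centre of the city.",
}
corpus = {name: bag_of_words(preprocess(text)) for name, text in descriptions.items()}
museum_tokens = preprocess(descriptions["museum"])
```

The resulting counts would then be handed to an LDA implementation from a library; that step is left out of the sketch.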
Finally, thanks to this identification of the most relevant topics and words, the end user receives a list of the points of interest that best match their filter selection, ranked with a statistical similarity metric.
This lets us address the “cold-start” problem of recommendation systems based on Collaborative Filtering, which lack initial information about anonymous or new users.
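To make the ranking step concrete, the sketch below scores each POI against the user's selection. The metric is not named above, so cosine similarity is an assumption, and the topic order, the topic weights and the POI names are invented for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user_profile, poi_topics, top_n=3):
    """Return the names of the top_n POIs most similar to the user's profile."""
    scored = [
        (cosine_similarity(user_profile, topics), name)
        for name, topics in poi_topics.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_n]]


# Hypothetical topic order: [culture, gastronomy, leisure].
poi_topics = {
    "art_museum": [0.8, 0.1, 0.1],
    "food_market": [0.1, 0.8, 0.1],
    "city_park": [0.2, 0.1, 0.7],
}
user_profile = [1.0, 0.0, 0.0]  # a user who only ticked the "culture" filter
ranking = recommend(user_profile, poi_topics)
```

Because the profile is built from the filters the user selects rather than from past behaviour, this kind of ranking works for a first-time user.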
Once the most relevant points of interest for the user have been listed, we move on to the optimization model for the generation of routes.
This part was quite difficult because of the sheer number of solution approaches available in the field of combinatorial optimization.
Because our case is a route optimization problem, we looked for those optimization problems that, solved by means of metaheuristic algorithms, would fall within the framework of our project.
Investigating, we saw that our problem was framed within the set of routing problems, more specifically those based on TSP (Traveling Salesman Problem).
The TSP is one of history’s most famous problems. In the days when salespeople sold vacuum cleaners and encyclopedias door to door, they had to plan their routes from house to house or city to city. The shorter the route, the better.
Finding the shortest route that visits a set of locations is an exponentially difficult problem: because the number of candidate routes grows factorially, finding the shortest route for 20 locations is far more than twice as hard as for 10.
To tackle it, optimization techniques are needed that intelligently search the solution space and find near-optimal solutions.
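To put numbers on that growth, the short calculation below counts the distinct round trips through n locations, assuming symmetric distances (so a route and its reverse count once). The function name is just for illustration.

```python
import math


def count_routes(n_locations):
    """Distinct round trips through n locations when distances are symmetric."""
    if n_locations < 3:
        return 1
    # Fix the starting point, order the rest, and halve for the two directions.
    return math.factorial(n_locations - 1) // 2


routes_10 = count_routes(10)  # 181,440
routes_20 = count_routes(20)  # 60,822,550,204,416,000
```

Doubling the number of locations from 10 to 20 multiplies the number of candidate routes by a factor of roughly 3 × 10^11, which is why exhaustive search stops being an option very quickly.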
Mathematically, this problem can be represented as a graph, where the locations are the nodes and the edges (or arcs) represent direct travel between locations. The weight of each edge is the distance between its two nodes. The goal is to find the tour through all the nodes with the smallest sum of weights.
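The graph formulation can be tried out on a toy instance. The sketch below stores the edge weights in a matrix and simply tries every tour; it is a brute-force illustration that only works for a handful of nodes, not the metaheuristic approach used in the project, and the weights are made up.

```python
from itertools import permutations


def tour_length(tour, weights):
    """Sum of edge weights along the tour, including the edge back to the start."""
    return sum(weights[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def brute_force_tsp(weights, start=0):
    """Try every ordering of the remaining nodes and keep the shortest tour."""
    others = [n for n in range(len(weights)) if n != start]
    best = min(
        ((start,) + p for p in permutations(others)),
        key=lambda t: tour_length(t, weights),
    )
    return list(best), tour_length(best, weights)


# weights[i][j] is the distance between node i and node j (illustrative values).
weights = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
best_tour, best_length = brute_force_tsp(weights)
```

For these four nodes the shortest tour has a total weight of 18.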
In our case, the main objective is to visit the maximum number of possible points in the time indicated by the user.
For this reason, we use the OR-Tools library provided by Google. It has a specific API for routing problems that fits ours, with a solver based on local search techniques guided by metaheuristics.
Our optimizer collects the travel times between the recommended points and generates two matrices, one per means of transport (on foot or by metro). Once these two matrices have been obtained, the best means of transport between each pair of points, metro or walking, is selected and the result is fed into the solver, which returns the best route it finds.
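One simple way to merge the two matrices is sketched below. It assumes that "best" means the faster option for each pair of points and that ties go to walking; both are assumptions for the example, as are the function name and the travel times.

```python
def combine_modes(walk_minutes, metro_minutes):
    """For each pair of points keep the faster option and remember which one it was."""
    n = len(walk_minutes)
    best_time = [[0] * n for _ in range(n)]
    best_mode = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no travel from a point to itself
            if walk_minutes[i][j] <= metro_minutes[i][j]:
                best_time[i][j], best_mode[i][j] = walk_minutes[i][j], "walk"
            else:
                best_time[i][j], best_mode[i][j] = metro_minutes[i][j], "metro"
    return best_time, best_mode


# Illustrative travel times in minutes between three points.
walk = [[0, 10, 40], [10, 0, 35], [40, 35, 0]]
metro = [[0, 15, 20], [15, 0, 18], [20, 18, 0]]
time_matrix, mode_matrix = combine_modes(walk, metro)
```

The combined time matrix is what a routing solver would receive, while the mode matrix is kept aside to tell the user how to travel each leg of the route.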
As a general rule, in this type of problem the solution must pass through every point for the route to be feasible. We have added two restrictions and two slacks to make the solution flexible: the route always respects the time the user has available, and POIs can be discarded when they do not fit within it. This lets us return the best route that visits as many POIs as possible.
The first restriction limits the travel time according to the user’s preferences. The second penalizes the points that do not fit within that limit and drops them from the solution so that the problem remains feasible.
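The effect of these two ingredients can be seen on a toy instance. The sketch below does not call OR-Tools; it is a brute-force illustration, under assumed values, of the same formulation: a hard cap on total travel time plus a penalty added for every POI left out. The function names, the penalty value, the open route starting at node 0 and the travel times are all hypothetical.

```python
from itertools import combinations, permutations


def route_time(route, minutes):
    """Total travel time along the route (no return to the start)."""
    return sum(minutes[a][b] for a, b in zip(route, route[1:]))


def best_route(minutes, budget, drop_penalty, start=0):
    """Minimise travel time plus a penalty per dropped POI, within the time budget."""
    pois = [n for n in range(len(minutes)) if n != start]
    # Fallback: visit nothing and pay the penalty for every POI.
    best = ([start], 0, drop_penalty * len(pois))
    for k in range(1, len(pois) + 1):
        for subset in combinations(pois, k):
            for order in permutations(subset):
                route = [start, *order]
                elapsed = route_time(route, minutes)
                if elapsed > budget:
                    continue  # restriction 1: never exceed the available time
                # restriction 2: each POI left out adds a penalty to the cost
                cost = elapsed + drop_penalty * (len(pois) - k)
                if cost < best[2]:
                    best = (route, elapsed, cost)
    return best[0], best[1]


# Illustrative travel times in minutes; node 0 is the starting point.
minutes = [
    [0, 10, 20, 18],
    [10, 0, 18, 35],
    [20, 18, 0, 12],
    [18, 35, 12, 0],
]
full_route, full_time = best_route(minutes, budget=45, drop_penalty=1000)
short_route, short_time = best_route(minutes, budget=30, drop_penalty=1000)
```

With 45 minutes available all three POIs fit in the route; with 30 minutes one POI is dropped instead of the problem becoming infeasible. Because the penalty is much larger than any travel time, the search first maximises the number of POIs visited and only then minimises the time.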
Finally, we come to the point-of-interest recognizer based on computer vision. The main reason for this part was the added value we wanted the tool to give the user. It was motivated by two premises: first, nothing like it exists on the market today, and second, AI is the main part of the solution.
With this, we also aim to emulate the role of a human tour guide with a virtual assistant. This last part closes the circle of our application and completes the assistant.
For the recognition of points of interest, the solution was to build a multi-label classifier in which each POI is a unique label. The classifier is based on a deep convolutional network architecture (VGG16) and was built with the Keras library.
Once the point of interest is recognized, the image recognizer returns the label associated with the POI and, by looking it up in the database, we can provide the information to the user.
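The lookup step after classification can be sketched as follows. The network itself is left out: the function receives the per-label scores a classifier would output. The label names, the database contents and the confidence threshold are assumptions for the example, not part of the project as described.

```python
def identify_poi(probabilities, labels, database, min_confidence=0.5):
    """Map classifier scores to a POI record, or None if no label is confident enough."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best_index] < min_confidence:
        return None  # hypothetical safeguard: do not guess on unclear photos
    label = labels[best_index]
    return {"label": label, **database[label]}


# Hypothetical labels and placeholder records.
labels = ["puerta_de_alcala", "plaza_mayor", "templo_de_debod"]
database = {
    "puerta_de_alcala": {"name": "Puerta de Alcalá", "info": "Stored description..."},
    "plaza_mayor": {"name": "Plaza Mayor", "info": "Stored description..."},
    "templo_de_debod": {"name": "Templo de Debod", "info": "Stored description..."},
}

match = identify_poi([0.10, 0.85, 0.05], labels, database)
no_match = identify_poi([0.40, 0.30, 0.30], labels, database)
```

Returning nothing when the scores are low is one possible design choice: it avoids showing the user information about the wrong monument.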
From our point of view, this project lays the first stone of an innovative solution for the online tourism market, with long-term projection yet feasible to turn into a product.
In addition, given the size of the project and its wide range of opportunities, our solution can clearly grow and scale along future lines of work such as other cities, new data sources, weather data, new means of transport, integration with museums, further optimization of the models, or integration with social networks.
On the other hand, the online tourism market is booming, and in the future the vast majority of interactions will be made through virtual assistants. This, coupled with the difficulty of planning a trip using tourist information websites that act as conventional guides in an online format and do not help generate a travel plan adapted to the tourist’s time and interests, provides ideal terrain for continuing the project.
Finally, given the potential our team has seen in this idea, the objective is to continue with the project in order to build a feasible, useful and innovative solution that meets needs tourists currently cannot satisfy.
To be continued…
Originally published at https://medium.com on December 17, 2019.