
Adam Witkowski, Data Scientist and Developer at MIM Solutions, took part in the Data Science Summit 2020, which began on 16 October. He presented
The “Top Disruptors in Healthcare” report has just been issued – with MIM Solutions included as one of the most innovative Polish startups! During the
MIM Solutions and the Swedish start-up AcoraiAB are starting a project aimed at better predicting the prognosis of cardiac patients. As part of the project,
MIM Solutions was launched as a spin-off of the University of Warsaw’s Algorithms Group, directed by Prof. Piotr Sankowski. The company brought together experts passionate about solving practical algorithmic problems efficiently, and over time its focus shifted towards machine learning. Although MIM Solutions is no longer part of the university, we still cooperate closely with it.
MIM Solutions specialises in difficult tasks. We are proficient in providing effective solutions, especially when standard methods have failed. For the most common problems in our areas of specialisation, we also offer a set of generic services that can be deployed swiftly in any environment.
MIM Solutions is a company registered in the National Court Register kept by the District Court for the City of Warsaw, 13th Commercial Division of the National Court Register. KRS: 0000581404, NIP: PL5213710082.
When obtaining information from our clients, we often receive access to data consisting only of positive events, e.g., a list of items purchased by each user or a list of ads each user clicked.
Many machine learning models need not only positive but also negative events to estimate the probability of a positive event correctly. These could be items a user did not buy during a visit to the store (despite having a chance to buy them) or ads that the user saw but did not click on. In some projects, there are so many negative events that processing all of them is too time-consuming. In such situations, we use negative event sampling, i.e., selecting a random subset of all potentially available negative events.
In this strategy of building a training set, you have to watch out for several traps:
• It is essential to avoid selecting a negative event that is identical to an existing positive event.
• You have to draw from the complete set of available negative events, but avoid adding contradictory data to the training set, e.g., events involving a product that was unavailable on a given day or a brick-and-mortar store that was closed that day.
• When the goal is to distinguish good product recommendations from average ones, the randomly selected events in the training set should be good and average recommendations, not good and bad ones. We used this strategy in the RecSys 2016 competition: https://lnkd.in/dgUb-FzC
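The first two traps above can be illustrated with a minimal Python sketch. It is not our production code: the function name `sample_negatives`, the representation of events as (user, item) pairs, and the `is_valid` callback (which stands in for checks such as product availability or store opening days) are all assumptions made for this example.

```python
import random

def sample_negatives(positives, users, items, is_valid, n_negatives, seed=0):
    """Draw up to n_negatives random (user, item) pairs to use as negative events.

    positives: iterable of observed (user, item) positive events.
    is_valid:  hypothetical callback returning False for events that could
               not have happened (e.g. the item was unavailable).
    """
    rng = random.Random(seed)
    positive_set = set(positives)
    negatives = set()
    attempts = 0
    max_attempts = 100 * n_negatives  # stop even if too few candidates exist
    while len(negatives) < n_negatives and attempts < max_attempts:
        attempts += 1
        candidate = (rng.choice(users), rng.choice(items))
        if candidate in positive_set:
            continue  # trap 1: a negative must not duplicate a positive event
        if not is_valid(*candidate):
            continue  # trap 2: skip contradictory or impossible events
        negatives.add(candidate)
    return sorted(negatives)

# Toy data: item "c" is treated as unavailable, so it is never sampled.
positives = [("u1", "a"), ("u2", "b")]
negatives = sample_negatives(
    positives,
    users=["u1", "u2"],
    items=["a", "b", "c"],
    is_valid=lambda user, item: item != "c",
    n_negatives=2,
)
print(negatives)
```

Rejection sampling is used here because it never has to enumerate the full set of negative events, which is the reason for sampling in the first place. The attempt limit keeps the loop finite when fewer valid negatives exist than requested.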
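Sampling only a subset of the negative events raises the share of positive events in the training set, so a model trained on it tends to overestimate the probability of a positive event. If model predictions are used as accurate probability estimates, for example, to calculate expected revenue from an ad impression, they need to be recalibrated.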
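The text does not say which recalibration method is used, so the following is only one common option, sketched under explicit assumptions: all positive events are kept, each negative event is kept independently with the same probability (called `negative_rate` here, a name chosen for this example), and the model is well calibrated on the sampled data. Under those assumptions the odds of a positive event are inflated by a factor of 1 / `negative_rate`, and the correction below undoes that.

```python
def recalibrate(p_sampled, negative_rate):
    """Map a probability predicted on downsampled data back to the full data.

    p_sampled:     model prediction trained on the sampled training set.
    negative_rate: assumed fraction of negative events kept, in (0, 1].
    """
    if not 0.0 < negative_rate <= 1.0:
        raise ValueError("negative_rate must be in (0, 1]")
    if not 0.0 <= p_sampled <= 1.0:
        raise ValueError("p_sampled must be a probability")
    # Equivalent to multiplying the predicted odds by negative_rate.
    return p_sampled / (p_sampled + (1.0 - p_sampled) / negative_rate)

# With 10% of negatives kept, a prediction of 0.5 (odds 1:1) corresponds
# to odds of 1:10 on the full data.
p_full = recalibrate(0.5, negative_rate=0.1)
print(p_full)
```

If negatives are not sampled uniformly, or the model is not calibrated on the sampled data to begin with, this closed-form correction does not apply directly and a calibration model fitted on unsampled data may be needed instead.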