Traps in sampling negative events

Date of publication: 3 years ago


When obtaining information from our clients, we often receive access to data consisting only of positive events, e.g. a list of items purchased by each user or a list of ads they clicked.

Many machine learning models need negative events as well as positive ones to correctly estimate the probability of a positive event. Negative events could be items that a user did not buy during their visit to the store (despite having a chance to buy them) or ads that the user saw but did not click on. In some projects, there are so many negative events that processing all of them is too time-consuming. In such situations, we use negative event sampling, i.e. selecting a random subset of all potentially available negative events.
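As a minimal sketch of negative event sampling, the snippet below keeps every positive event and keeps each negative event independently with a fixed probability. The function name `sample_negatives`, the `rate` parameter and the toy ad data are hypothetical and serve only as an illustration; in a real project the sampling scheme may differ.

```python
import random


def sample_negatives(negatives, rate, seed=0):
    """Keep each negative event independently with probability `rate`.

    `negatives` is any iterable of events; `rate` is the fraction of
    negatives we expect to keep (an illustrative parameter).
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = random.Random(seed)  # seeded so the sample is reproducible
    return [event for event in negatives if rng.random() < rate]


# Toy data: (user, ad) pairs that were shown, and the subset that was clicked.
impressions = [("u1", "ad1"), ("u1", "ad2"), ("u2", "ad1"), ("u2", "ad3")]
clicks = {("u1", "ad1")}

# Negative events: ads the user saw but did not click on.
negatives = [event for event in impressions if event not in clicks]

# Training set: all positives plus a random subset of the negatives.
training_set = [(event, 1) for event in clicks]
training_set += [(event, 0) for event in sample_negatives(negatives, rate=0.5)]
```

The value of `rate` is worth recording together with the training set, because it is needed later if the model predictions have to be recalibrated.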

With this strategy for building a training set, you have to watch out for several traps:

It is important to avoid selecting a negative event that is identical to a positive event, e.g. marking a product as not bought by a user who in fact bought it.
You have to draw from the full set of available negative events, but avoid adding contradictory data to the training set, e.g. the purchase of a product that was unavailable on a given day or a purchase from a brick-and-mortar store that was closed that day.
When the goal is to distinguish good product recommendations from average ones, the randomly selected events in the training set should be average recommendations, not bad ones, so that the model learns from good and average examples. We used this strategy in the RecSys 2016 competition: https://arxiv.org/pdf/1612.00959.pdf .
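The first two traps can be handled by filtering the candidate events before drawing. The sketch below is only an illustration: the function `draw_negatives`, the `(user, product, day)` event layout and the `is_possible` callback are hypothetical names chosen for this example, not part of any specific project.

```python
import random


def draw_negatives(candidates, positives, is_possible, n, seed=0):
    """Draw up to `n` negative events uniformly from the valid candidates.

    A candidate is valid if it does not duplicate a positive event and if
    `is_possible(event)` says the event could have happened at all.
    """
    positive_set = set(positives)
    valid = [
        event for event in candidates
        if event not in positive_set and is_possible(event)
    ]
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return rng.sample(valid, min(n, len(valid)))


# Toy data: events are (user, product, day) triples.
candidates = [
    ("u1", "p1", "mon"), ("u1", "p2", "mon"),
    ("u2", "p1", "tue"), ("u2", "p2", "tue"),
]
positives = [("u1", "p1", "mon")]   # u1 really bought p1 on Monday
unavailable = {("p2", "tue")}       # p2 could not be bought on Tuesday


def could_be_bought(event):
    _user, product, day = event
    return (product, day) not in unavailable


negatives = draw_negatives(candidates, positives, could_be_bought, n=2)
```

Filtering first and sampling second keeps the draw uniform over the events that are actually valid, instead of rejecting bad samples one at a time.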

If model predictions are used as accurate probability estimates, for example to calculate the expected revenue from an ad impression, they need to be recalibrated, because a model trained on sampled negatives overestimates the probability of a positive event. We do this in the same way as the Facebook team, as described in section 6.3 of their publication.
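A minimal sketch of such a recalibration is shown below. It assumes that the publication referred to is "Practical Lessons from Predicting Clicks on Ads at Facebook" (He et al., 2014), that all positive events were kept, and that each negative event was kept with probability w. Under these assumptions a prediction p made in the downsampled space maps to q = p / (p + (1 - p) / w). The function name `recalibrate` and the revenue figure are hypothetical.

```python
def recalibrate(p, w):
    """Map a prediction from the downsampled space back to a probability.

    p: model prediction trained on data with sampled negatives.
    w: negative sampling rate, i.e. the fraction of negatives kept (0 < w <= 1).
    Assumes that all positive events were kept.
    """
    if not 0.0 < w <= 1.0:
        raise ValueError("w must be in (0, 1]")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    return p / (p + (1.0 - p) / w)


# Example: the model says 0.5, but only 10% of the negatives were kept.
q = recalibrate(0.5, 0.1)

# Expected revenue from one impression, with an illustrative revenue per click.
revenue_per_click = 2.0
expected_revenue = q * revenue_per_click
```

With w = 1 (no sampling) the formula returns p unchanged, which is a quick sanity check for any implementation.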
