Fusepump: Difference between revisions

From Computer Laboratory Group Design Projects

Revision as of 14:18, 17 September 2014

Contact: Lauren Dawe (lauren.dawe@fusepump.com)

Ecommerce retailers face a seemingly insurmountable barrier to fully automating their online marketing activity. Many marketing channels, such as the price comparison sites Google Shopping and Kelkoo or the marketplaces eBay and Amazon, require formatted product data, but each has its own format. Most retailers struggle to adapt their product data to these many distinct formats. This problem manifests itself mostly in the process of product category mapping: that is, the mapping of a retailer's list of product categories onto a different category taxonomy. This is an issue that has caused problems for some of the world's largest ecommerce companies, despite the fact that classification has been the subject of machine learning research for decades.

The aim of this project will be to develop an ecommerce-optimised machine learning tool that takes an unmapped category (and potentially other product information) and outputs a mapping and a confidence level. A user interface will need to be created to allow human users to view a list of mappings by confidence level, make manual corrections to mappings, add to the training set, and view the progress of the mapping.
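
To make the intended input and output concrete, here is a minimal sketch in Python of the interface described above. It is not a proposed solution: it stands in for the machine learning component with a plain word-overlap (Jaccard) score, and every name in it (Mapping, map_category, review_queue, correct) and every category string is invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Mapping:
    source: str        # retailer category as supplied (hypothetical data)
    target: str        # best match in the target taxonomy, "" if none found
    confidence: float  # Jaccard similarity of word sets, between 0 and 1


def tokens(category: str) -> set[str]:
    """Lower-case a category path and split it into a set of words."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in category.lower())
    return set(cleaned.split())


def map_category(source: str, training: dict[str, str]) -> Mapping:
    """Return the target of the most similar training example.

    Assumption: word overlap is used here only as a placeholder for
    whatever learned model the project ends up with.
    """
    src = tokens(source)
    best_target, best_score = "", 0.0
    for example, target in training.items():
        ex = tokens(example)
        union = src | ex
        score = len(src & ex) / len(union) if union else 0.0
        if score > best_score:
            best_target, best_score = target, score
    return Mapping(source, best_target, best_score)


def review_queue(sources: list[str], training: dict[str, str]) -> list[Mapping]:
    """List mappings with the least confident first, for human review."""
    mappings = [map_category(s, training) for s in sources]
    return sorted(mappings, key=lambda m: m.confidence)


def correct(mapping: Mapping, right_target: str, training: dict[str, str]) -> None:
    """Record a manual correction as a new training example."""
    training[mapping.source] = right_target


if __name__ == "__main__":
    # Invented retailer categories mapped to an invented target taxonomy.
    training = {
        "Clothing > Men > T-Shirts": "Apparel/Tees",
        "Music > CDs": "Music/Albums",
    }
    for m in review_queue(["Mens T-Shirts", "Garden Hoses"], training):
        print(m)
```

The review queue puts the least confident mappings first so that a human reviewer sees the likely errors early, and each correction is added back to the training set, which corresponds to the user interface requirements listed above.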


Thanks for this - it looks good. I think it might still be useful to give an example of a typical set of retailer categories and an example of a "different taxonomy".

How about a mapping of eBay or Kelkoo categories to a site like this: https://www.districtlines.com/


Original Suggestion:

"Ecommerce retailers face a serious barrier to being able to fully automate their marketing activity. Many marketing channels require formatted product data, but crucially each has its own format which most retailers do not have the ability to support. This problem manifests itself mostly in product category mapping, an issue which has caused problems for some of the world's largest companies.

You will need to make a machine learning tool that uses rules to take an unmapped category, and output a mapping based on a confidence level that can be used in marketing channels. Alongside this, a user interface will need to be created to allow human users to make manual interventions to mappings, show all mappings by confidence levels and allow additional information to be added to the training set."

Response:

I think this would be a good basis for an interesting project. We may need to add some explanation, perhaps just by using examples, so that the computer science students who participate in this course understand what is meant by "marketing channel" and "product category mapping." We don't teach e-commerce as a separate subject, so they may not have come across the jargon, but I expect they will recognise these concepts as applied in well-known websites. It might be a good idea to choose sites that would have some natural appeal to students - perhaps in popular and youth culture.

One technical challenge in machine learning projects is gaining access to a training dataset large enough for the project results to be comparable to current commercial offerings. Where a client is able to provide access to an interesting data set, this can result in an exciting project - for example, a prize-winning group a couple of years ago were given access to live data from Last.fm.