Neural Guide
Client: Jan Kis, IMC <Jan.Kis@imc.com>
The NeuralTalk Model Zoo provides pre-trained deep neural net models that can be used to generate text descriptions of unseen images. In principle, such sentence generation could be used to assist blind people, allowing them to point their mobile phone at a scene, upload the camera image to a server, and receive the predicted text as synthesised speech. The results are likely to be far less reliable than for images from the NeuralTalk demo database, so you will probably have to provide audio or tactile feedback on image quality, prediction confidence, and guidance to help the user point the phone in a more productive direction.
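As a sketch of the end-to-end loop described above, the hypothetical Python client below uploads a camera image to a captioning server and reads the prediction back as synthesised speech, warning the user when confidence is low. The endpoint URL, the JSON fields (caption, confidence), and the threshold are illustrative assumptions rather than part of the NeuralTalk distribution, and pyttsx3 stands in for whatever speech synthesis the phone platform actually provides.

 """Sketch of a Neural Guide client: upload a photo to a captioning
 server and speak the predicted sentence. The endpoint URL and the
 JSON fields ('caption', 'confidence') are illustrative assumptions."""
 
 import requests
 import pyttsx3
 
 SERVER_URL = "http://example.org/caption"  # hypothetical captioning endpoint
 CONFIDENCE_THRESHOLD = 0.5                 # illustrative cut-off
 
 def describe_image(image_path: str) -> None:
     # Upload the captured image; the server is assumed to run the
     # pre-trained model and return JSON such as
     # {"caption": "a dog on a beach", "confidence": 0.73}.
     with open(image_path, "rb") as f:
         response = requests.post(SERVER_URL, files={"image": f}, timeout=30)
     response.raise_for_status()
     result = response.json()
 
     engine = pyttsx3.init()
     if result.get("confidence", 0.0) < CONFIDENCE_THRESHOLD:
         # Low-confidence prediction: guide the user rather than mislead them.
         engine.say("Low confidence. Try pointing the camera in another direction.")
     else:
         engine.say(result["caption"])
     engine.runAndWait()
 
 if __name__ == "__main__":
     describe_image("photo.jpg")

Keeping the model behind a simple HTTP endpoint leaves the heavy neural-net computation on the server, matching the upload-to-server design in the brief, while the confidence check is one possible hook for the audio or tactile feedback the project calls for.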