Recognizing digits from an image using Python and Tensorflow

Details: Ubuntu 14.04 (LTS), OpenCV 2.4.13, Spyder 2.3.9 (Python 2.7), Tensorflow r0.10

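I want to recognize the digits in the image using Python and Tensorflow (optionally OpenCV).

Also, I want to train the model on the MNIST data with Tensorflow,

like this (the code is based on the video on this page):

Code: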

import tensorflow as tf
import random

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/",one_hot=True)

x = tf.placeholder("float",[None,784])
y = tf.placeholder("float",[None,10])

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

### modeling ###

activation = tf.nn.softmax(tf.matmul(x,W) + b)

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(activation),reduction_indices=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)

init = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)

### training ###

for epoch in range(training_epochs) :

    avg_cost = 0
    total_batch = int(mnist.train.num_examples/batch_size)

    for i in range(total_batch) :

        batch_xs,batch_ys =mnist.train.next_batch(batch_size)
        sess.run(optimizer,feed_dict={x: batch_xs,y: batch_ys})
        avg_cost += sess.run(cross_entropy,feed_dict={x: batch_xs,y: batch_ys}) / total_batch

    if epoch % display_step == 0 :
        print "Epoch : ","%04d" % (epoch+1),"cost=","{:.9f}".format(avg_cost)

print "Optimization Finished"

### predict number ###

r = random.randint(0,mnist.test.num_examples - 1)
print "Prediction: ",sess.run(tf.argmax(activation,1),{x: mnist.test.images[r:r+1]})
print "Correct Answer: ",sess.run(tf.argmax(mnist.test.labels[r:r+1],1))

However, the problem is how to produce a numpy array like the one below.

Code added:

mnist.test.images[r:r+1]

[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.50196081 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 1. 1. 0.25098041 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0.74901962 0. 0.50196081 1. 0. 0. 0. 0. 0.50196081 0. 0. 0. 0.25098041 0. 0. 0. 0.74901962 0. 0. 0. 0.74901962 0. 0. 0. 0.74901962 0. 0. 0. 0.25098041 1. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 1. 0. 0. 0.25098041 1. 0. 0. 0.74901962 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.25098041 0.74901962 0. 0. 0. 0.74901962 0. 0.25098041 1. 0. 0. 0.74901962 1. 0.50196081 1. 1. 0. 0. 0. 0. 1. 1. 0.50196081 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0.50196081 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0.50196081 0.50196081
0.50196081 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0.50196081 1. 1. 1.
1. 0.50196081 0.25098041 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0.50196081 1. 1. 1. 1.
1. 1. 1. 1.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0.74901962 1. 1. 1.
0.50196081 0.50196081 0.50196081 0.74901962 1. 1.
0.74901962 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0.50196081 1. 1. 1.
0. 0. 0. 0. 0.
1. 0.74901962 0. 0. 0.
0. 0. 0. 0. 0.
0. 1. 1. 1.
0. 0. 0. 0. 0.
0.25098041 1. 1. 0.74901962
0. 0. 0. 0. 0.
0. 0. 0.74901962 1. 1.
0. 0. 0. 0. 0.
0. 0. 0.25098041 1. 1.
0. 0. 0. 0. 0.
0. 0.50196081 1. 1.
0. 0. 0. 0. 0.
0. 0. 0. 0.
1. 0.50196081 0. 0. 0.
0. 0. 0. 0.50196081
0.25098041 0. 0. 0. 0.
0. 0. 0. 0. 0.
1. 1. 0.50196081 0. 0.
0. 0. 0. 0. 1.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0.25098041 1. 1. 1. 0. 0.
0. 0. 0. 0. 1.
0.50196081 0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
1. 1. 1. 0. 0.
0. 0. 0. 0.
0.50196081 0. 0. 0. 0.
0. 0. 0. 0. 0.
0.74901962 1. 1. 1. 0.25098041
0. 0. 0. 0. 0.
0.50196081 1. 1. 0. 0. 0.
0. 0. 0. 0.
0.74901962 1. 1. 1. 1.
0. 0. 0. 0. 0.
0. 0.50196081 1. 1.
0. 0. 0. 0.
0.50196081 1. 1. 1. 1. 1.
0.50196081 0. 0. 0. 0. 0.
0. 0. 0. 0.
1. 1. 1. 0.50196081
0.74901962 1. 1. 1. 1. 1.
0.50196081 0. 0. 0. 0.
0. 0. 0. 0. 0.
0.74901962 1. 1. 1. 1.
1. 1. 1. 1. 1.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0.25098041 1. 1. 1.
1. 1. 0.50196081 0.25098041
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0.
0.50196081 0.50196081 0.50196081 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. 0. 0. 0.
0. 0. ]]

When I use OpenCV for this, I can build a numpy array from the image, but it looks a bit odd.
(I want to turn the array into a 28x28 vector.)

Code added:

image = cv2.imread("img_easy.jpg")
resized_image = cv2.resize(image,(28,28))

[[[255 255 255] [255 255 255] [255 255 255] …,[255 255 255] [255 255 255] [255 255 255]]

[[255 255 255] [255 255 255] [255 255 255] …,[255 255 255] [255 255 255] [255 255 255]]

…,

[[255 255 255] [255 255 255] [255 255 255] …,[255 255 255] [255 255 255] [255 255 255]]]

Then I put the value ('resized_image') into the Tensorflow code,
like this:

Code modified:

### predict number ###

print "Prediction: ",sess.run(tf.argmax(activation,1),{x: resized_image})
print "Correct Answer: 9"

As a result, an error occurred on that line.

ValueError: Cannot feed value of shape (28,28,3) for Tensor u'Placeholder_2:0', which has shape '(?,784)'

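Finally,

1) I want to know how to produce data that can be fed into the tensorflow code (perhaps a numpy array of shape [784]).

2) Do you know of any digit recognition examples that use tensorflow?

I am a beginner in machine learning.

Please tell me in detail what I should do.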

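Best answer

It looks like the image you are using is a three-channel color image, hence the third dimension in (28,28,3).

The original MNIST images, by contrast, are grayscale with a width and height of 28. That is why the x placeholder has shape [None,784]: 28 * 28 = 784.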
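The shape mismatch can be checked with numpy alone. The sketch below is only an illustration: `rgb_image` and `gray_image` are made-up stand-ins for the arrays OpenCV would return, not data from the question.

```python
from __future__ import print_function  # so the same code runs on Python 2.7 and 3
import numpy as np

# Hypothetical stand-in for the colour image produced by cv2.resize in the question.
rgb_image = np.full((28, 28, 3), 255, dtype=np.uint8)
# Hypothetical stand-in for a single-channel (grayscale) image of the same size.
gray_image = np.full((28, 28), 255, dtype=np.uint8)

print(rgb_image.size)   # number of elements: 28 * 28 * 3
print(gray_image.size)  # number of elements: 28 * 28

# Only the grayscale image can be flattened into one row of 784 values,
# which is what a placeholder of shape [None,784] accepts.
row = gray_image.reshape(1, 784)
print(row.shape)

try:
    rgb_image.reshape(1, 784)
except ValueError as err:
    print("reshape failed:", err)
```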

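By default cv2.imread loads the image as a three-channel color image (in BGR order), whereas you want it to be grayscale, i.e. (28,28).
You may find it helpful to pass the grayscale flag when calling imread.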

image = cv2.imread("img_easy.jpg",cv2.CV_LOAD_IMAGE_GRAYSCALE)

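With this change, your image should have the correct shape (28,28) after resizing. It still has to be reshaped to (1,784) before it can be fed to x.

Also, the CV2 image values are not in the same range as the MNIST image shown in your question: they run from 0 to 255. You will probably have to normalize the values in your image so that they fall in the range 0-1.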
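The normalization and reshaping steps can be sketched with numpy as follows. The helper `to_mnist_input` and its `invert` parameter are hypothetical names introduced here for illustration. The inversion is an assumption: MNIST digits are bright strokes on a dark background (as the array in the question shows), while the image in the question appears to have a white background, so it may need to be flipped.

```python
from __future__ import print_function  # so the same code runs on Python 2.7 and 3
import numpy as np

def to_mnist_input(gray_image, invert=True):
    """Turn a 28x28 uint8 grayscale image into a (1, 784) float32 row in [0, 1]."""
    gray_image = np.asarray(gray_image)
    if gray_image.shape != (28, 28):
        raise ValueError("expected shape (28, 28), got %s" % (gray_image.shape,))
    # Scale pixel values from 0..255 down to 0..1, like the MNIST arrays.
    pixels = gray_image.astype(np.float32) / 255.0
    if invert:
        # Assumption: the input is a dark digit on a white background,
        # so flip it to match MNIST (bright digit on a dark background).
        pixels = 1.0 - pixels
    # Flatten to a single row so it matches a placeholder of shape [None,784].
    return pixels.reshape(1, 784)

# Synthetic test image: white background with a dark vertical stroke.
demo = np.full((28, 28), 255, dtype=np.uint8)
demo[4:24, 13:15] = 0
row = to_mnist_input(demo)
print(row.shape, row.dtype, row.min(), row.max())
```

In the question's setup, the 28x28 input would come from cv2.imread with the grayscale flag followed by cv2.resize, and the returned row could then be passed as {x: row} to the sess.run(tf.argmax(activation,1), ...) call.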

In addition, you may want to use a CNN (slightly more advanced, but it should give better results). For details, see the tutorials at https://www.tensorflow.org/tutorials/.
