bicloud

June 23, 2017
 
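1. Preparation
1.1 Install Python
Python 3.6 is currently supported; the simplest route is to download Anaconda3:
Anaconda 4.4.0 for Windows, Python 3.6, 64-bit
https://www.continuum.io/downloads
Once the download finishes, run the installer; at the environment-variable step, choose to have the environment variables configured.
1.2 Download Visual Studio Community 2017 to install the VC build environment
Download Visual Studio Community 2017 from https://www.visualstudio.com/zh-hans/
During installation, on the Workloads tab, select "Desktop development with C++".
Environment variable settings
Add the following to the PATH environment variable:
D:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.10.25017\bin\HostX64\x64
D:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE
Create a new LIB environment variable:
D:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.10.25017\lib\x64
Create a new INCLUDE environment variable:
D:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.10.25017\include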
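A quick way to double-check the variables above is to list any directory in PATH, LIB, or INCLUDE that does not actually exist (for example because the MSVC version folder differs on your machine). The following is only a sketch; the helper name missing_dirs is made up for illustration and is not part of any tool mentioned here.

```python
import os


def missing_dirs(var_name, environ=None):
    """Return the directories listed in an environment variable that do not exist."""
    environ = os.environ if environ is None else environ
    value = environ.get(var_name, "")
    # Entries are separated by os.pathsep (";" on Windows, ":" elsewhere).
    entries = [entry for entry in value.split(os.pathsep) if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]


if __name__ == "__main__":
    for name in ("PATH", "LIB", "INCLUDE"):
        if name not in os.environ:
            print("{} is not set".format(name))
            continue
        for directory in missing_dirs(name):
            print("{}: directory not found: {}".format(name, directory))
```

If the script prints nothing, every listed directory exists; it does not verify that the directories contain the expected compiler files.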
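1.3 Install CUDA
Download CUDA from https://developer.nvidia.com/cuda-downloads
Run the installer; it configures the CUDA environment variables automatically. You can check the CUDA_PATH setting on the environment variables configuration page.
Download cuDNN from https://developer.nvidia.com/cudnn
For cudnn-8.0-windows10-x64-v5.1, extract the archive and place the folder under the CUDA installation folder. Then copy cudnn64_5.dll from the cuDNN bin folder into the bin folder of the CUDA installation folder.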
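To confirm that the copied DLL can be found at load time, one option is to search the PATH directories for cudnn64_5.dll. This is a minimal sketch under the assumption that the CUDA bin folder is on PATH; find_on_path is a hypothetical helper name, not something provided by CUDA or TensorFlow.

```python
import os


def find_on_path(filename, path_value=None):
    """Return every directory in PATH (or in path_value) that contains filename."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    print("CUDA_PATH =", os.environ.get("CUDA_PATH", "<not set>"))
    locations = find_on_path("cudnn64_5.dll")
    if locations:
        print("cudnn64_5.dll found in:", locations[0])
    else:
        print("cudnn64_5.dll is not on PATH")
```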
2. Install tensorflow-gpu
pip install --upgrade -I setuptools
pip install --upgrade tensorflow-gpu
3. Test
import tensorflow as tf
a = tf.constant(2.0, dtype=tf.float32)
b = tf.constant(3.0, dtype=tf.float32)
sum = tf.add(a, b)
with tf.Session() as sess:
    print(sess.run([sum]))
Run log:
D:\Anaconda3\python.exe E:/PycharmProject/deeplearning/demo.py
2017-06-19 11:18:46.312109: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-06-19 11:18:46.312532: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-19 11:18:46.312924: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-19 11:18:46.313309: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-19 11:18:46.313686: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-19 11:18:46.314053: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-19 11:18:46.314408: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-19 11:18:46.314767: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-19 11:18:48.199218: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:940] Found device 0 with properties:
name: GeForce 940MX
major: 5 minor: 0 memoryClockRate (GHz) 1.189
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.36GiB
2017-06-19 11:18:48.199659: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:961] DMA: 0
2017-06-19 11:18:48.199868: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:971] 0:   Y
2017-06-19 11:18:48.200090: I c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0)
[5.0]
Process finished with exit code 0
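The line that matters in the log above is "Creating TensorFlow device (/gpu:0)": it shows that the GPU was actually picked up. If you save such a log to a file, a small parser can pull out the device names. This is a sketch; the regular expression is written against the log format shown in this post and may not match other TensorFlow versions.

```python
import re

# Matches e.g. "Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940MX, ..."
DEVICE_RE = re.compile(
    r"Creating TensorFlow device \((/gpu:\d+)\) -> \(device: \d+, name: ([^,]+),"
)


def gpu_devices(log_text):
    """Return a list of (device, name) pairs found in a TensorFlow run log."""
    return DEVICE_RE.findall(log_text)


if __name__ == "__main__":
    sample = ("Creating TensorFlow device (/gpu:0) -> "
              "(device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0)")
    print(gpu_devices(sample))
```

An empty result suggests TensorFlow fell back to the CPU, in which case the CUDA and cuDNN steps are worth rechecking.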

 
 Posted by at 9:47 PM
June 16, 2017
 
Comparing the Keras VGG16, VGG19, ResNet, and Inception models
# -*- coding: utf-8 -*-
# @DATE    : 2017/6/14 14:15
# @Author  : 
# @File    : classify_image.py

import os

import tensorflow as tf

from keras.applications import ResNet50
from keras.applications import InceptionV3
from keras.applications import Xception # TensorFlow ONLY
from keras.applications import VGG16
from keras.applications import VGG19
from keras.applications import imagenet_utils
from keras.applications.inception_v3 import preprocess_input
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import load_img
import numpy as np


MODEL =  {
    "vgg16": VGG16,
    "vgg19": VGG19,
    "inception": InceptionV3,
    "xception": Xception,
    "resnet": ResNet50
}

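def load_preprocess_image(model_name, image_path):
    # VGG16, VGG19, and ResNet50 take 224x224 inputs with the generic ImageNet preprocessing
    input_shape = (224, 224)
    preprocess_func = imagenet_utils.preprocess_input

    # InceptionV3 and Xception take 299x299 inputs scaled to [-1, 1]
    if model_name in ["inception", "xception"]:
        input_shape = (299, 299)
        preprocess_func = preprocess_input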

    # load image
    image = load_img(image_path, target_size=input_shape)

    # preprocess image
    image = img_to_array(image)
    image = np.expand_dims(image, axis=0)
    image = preprocess_func(image)

    return image


def image_model(model_name):
    network = MODEL[model_name]
    network = network(weights="imagenet")
    return network

def classify_image(model, image):
    preds = model.predict(image)
    preds = imagenet_utils.decode_predictions(preds)
    return preds


if __name__ == "__main__":
    data_dir = "data"
    image_names = ["ball.jpg"]
    image_paths = [ os.path.join(data_dir, image_name)  for image_name in image_names]
    model_names = ["vgg16", "vgg19", "resnet", "inception", "xception"]
    for image_path in image_paths:
        print("Image: {}".format(os.path.split(image_path)[-1]))
        for model_name in model_names:
            image = load_preprocess_image(model_name, image_path)
            model = image_model(model_name)
            preds = classify_image(model, image)
            print("Model: {}".format(model_name))
            print("Results(Top 5): ")
            for (i, (imagenetID, label, prob)) in enumerate(preds[0]):
                print("{}. {}: {:.2f}%".format(i + 1, label, prob * 100))


Run log:



Using TensorFlow backend.
Image: ball.jpg
2017-06-16 19:50:03.635390: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-16 19:50:03.635405: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-16 19:50:03.635408: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-16 19:50:03.635413: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Model: vgg16
Results(Top 5): 
1. soccer_ball: 93.73%
2. rugby_ball: 5.88%
3. volleyball: 0.13%
4. golf_ball: 0.11%
5. tennis_ball: 0.04%
Model: vgg19
Results(Top 5): 
1. soccer_ball: 98.07%
2. rugby_ball: 1.24%
3. golf_ball: 0.32%
4. volleyball: 0.20%
5. croquet_ball: 0.07%
Model: resnet
Results(Top 5): 
1. soccer_ball: 99.54%
2. rugby_ball: 0.39%
3. volleyball: 0.06%
4. running_shoe: 0.00%
5. football_helmet: 0.00%
Model: inception
Results(Top 5): 
1. soccer_ball: 99.88%
2. volleyball: 0.11%
3. rugby_ball: 0.00%
4. sea_urchin: 0.00%
5. silky_terrier: 0.00%
Model: xception
Results(Top 5): 
1. soccer_ball: 90.85%
2. volleyball: 2.48%
3. rugby_ball: 1.37%
4. balloon: 0.11%
5. airship: 0.10%
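In this run all five models rank soccer_ball first, with confidence from 90.85% (xception) to 99.88% (inception). To compare models over more images, the printed output can be reduced to the top-1 entry per model. The sketch below assumes the exact output format produced by the script above; top1_by_model is an illustrative helper, not part of Keras.

```python
import re


def top1_by_model(log_text):
    """Map each model name to its (label, percent) top-1 prediction."""
    results = {}
    model = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("Model: "):
            model = line[len("Model: "):]
            continue
        # Only the first-ranked line ("1. label: xx.xx%") is kept.
        match = re.match(r"1\. (\S+): ([\d.]+)%$", line)
        if match and model is not None:
            results[model] = (match.group(1), float(match.group(2)))
    return results


if __name__ == "__main__":
    sample = "Model: vgg16\nResults(Top 5): \n1. soccer_ball: 93.73%\n2. rugby_ball: 5.88%\n"
    print(top1_by_model(sample))
```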

Reference: http://www.pyimagesearch.com/2017/03/20/imagenet-vggnet-resnet-inception-xception-keras/






 
 Posted by at 7:47 PM
June 16, 2017
 
Chinese-Character-Recognition https://github.com/soloice/Chinese-Character-Recognition
ImageNet: VGGNet, ResNet, Inception, and Xception with Keras http://www.pyimagesearch.com/2017/03/20/imagenet-vggnet-resnet-inception-xception-keras/
TensorFlow large-scale input processing; see the CIFAR-10 code https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_input.py
A Survey on Deep Learning in Medical Image Analysis https://arxiv.org/pdf/1702.05747.pdf
http://www.jeyzhang.com/understanding-lstm-network.html
Awesome Deep learning papers and other resources https://github.com/endymecy/awesome-deeplearning-resources
Multi-Scale Convolutional Neural Networks for Time Series Classification https://arxiv.org/pdf/1603.06995v4.pdf
LSTM time series prediction https://github.com/RobRomijnders/LSTM_tsc
CNN time series prediction http://robromijnders.github.io/CNN_tsc/

 
 Posted by at 7:45 PM
June 12, 2017
 
Deep learning course http://www.samuelcheng.info/deeplearning_2017/
Docker: From Getting Started to Practice https://www.gitbook.com/book/yeasy/docker_practice/details
LONG SHORT-TERM MEMORY http://www.bioinf.jku.at/publications/older/2604.pdf
http://seanlook.com/tags/docker/
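docker run -ti --volume=$(pwd):/workspace caffe:cpu bash  starts the container with the current working directory mapped to /workspace. To switch from the Docker environment back to the host working directory, press Ctrl+P, then Ctrl+Q.
Container lifecycle management: docker [run|start|stop|restart|kill|rm|pause|unpause]
Container operations and maintenance: docker [ps|inspect|top|attach|events|logs|wait|export|port]
Container rootfs commands: docker [commit|cp|diff]
Image registry: docker [login|pull|push|search]
Local image management: docker [images|rmi|tag|build|history|save|import]
Other commands: docker [info|version]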
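The grouping above is easy to keep as a lookup table, for example when writing a small cheat-sheet script. The sketch below only encodes the list from this post; the group names and the group_of helper are my own labels and are not defined by Docker.

```python
# Subcommand groups exactly as listed in the notes above.
DOCKER_COMMAND_GROUPS = {
    "container lifecycle": ["run", "start", "stop", "restart", "kill", "rm", "pause", "unpause"],
    "container operations": ["ps", "inspect", "top", "attach", "events", "logs", "wait", "export", "port"],
    "container rootfs": ["commit", "cp", "diff"],
    "image registry": ["login", "pull", "push", "search"],
    "local image management": ["images", "rmi", "tag", "build", "history", "save", "import"],
    "other": ["info", "version"],
}


def group_of(subcommand):
    """Return the group a docker subcommand belongs to, or None if it is not listed."""
    for group, commands in DOCKER_COMMAND_GROUPS.items():
        if subcommand in commands:
            return group
    return None


if __name__ == "__main__":
    for name in ("commit", "push", "logs"):
        print("docker {:<8} {}".format(name, group_of(name)))
```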
https://huangying-zhan.github.io/ Faster R-CNN, Fast R-CNN, R-CNN
Transferrable Representations for Visual Recognition https://www2.eecs.berkeley.edu/Pubs/TechRpts/2017/EECS-2017-106.pdf

https://yahooeng.tumblr.com/post/151148689421/open-sourcing-a-deep-learning-solution-for
1. Install Docker on macOS: download the installer from https://store.docker.com/editions/community/docker-ce-desktop-mac and run it.
2. Check the Docker version: docker --version
Docker version 17.03.1-ce, build c6d412e
3. Test a web service
docker run -d -p 80:80 --name webserver nginx, then open localhost in a browser
4. Download the NSFW code: git clone https://github.com/yahoo/open_nsfw
5. Enter the NSFW code directory: cd open_nsfw/
6. Download the Caffe Dockerfile: wget https://github.com/BVLC/caffe/raw/master/docker/cpu/Dockerfile
7. Build the Caffe image: docker build -t caffe:cpu ./
8. Start Docker
docker run -ti caffe:cpu caffe --version
9. Map the working directory
docker run -ti --volume=$(pwd):/workspace caffe:cpu bash
10. Test NSFW image detection
wget http://image.tianjimedia.com/uploadImages/2015/288/26/R99Q7A2345V5.jpg
mv R99Q7A2345V5.jpg test3.jpg
python ./classify_nsfw.py \
--model_def nsfw_model/deploy.prototxt \
--pretrained_model nsfw_model/resnet_50_1by2_nsfw.caffemodel \
test3.jpg
Run log
I0605 12:36:59.237032    11 upgrade_proto.cpp:77] Attempting to upgrade batch norm layers using deprecated params: nsfw_model/resnet_50_1by2_nsfw.caffemodel
I0605 12:36:59.237094    11 upgrade_proto.cpp:80] Successfully upgraded batch norm layers using deprecated params.
I0605 12:36:59.242766    11 net.cpp:744] Ignoring source layer loss
NSFW score:   0.970513343811
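The last log line carries the result: a score between 0 and 1, here about 0.97. When classifying many images it helps to parse that line and bucket the score. The sketch below is my own helper, not part of open_nsfw; the 0.2 and 0.8 cut-offs are assumptions and should be tuned for your use case.

```python
import re


def parse_nsfw_score(output_text):
    """Extract the float after 'NSFW score:' from classify_nsfw.py output, or None."""
    match = re.search(r"NSFW score:\s*([0-9.]+)", output_text)
    return float(match.group(1)) if match else None


def label(score, low=0.2, high=0.8):
    """Bucket a score; the thresholds are illustrative assumptions."""
    if score >= high:
        return "likely NSFW"
    if score <= low:
        return "likely safe"
    return "uncertain"


if __name__ == "__main__":
    score = parse_nsfw_score("NSFW score:   0.970513343811")
    print(score, label(score))
```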

https://github.com/alex-paterson/Barebones-Flask-and-Caffe-Classifier  


 
 Posted by at 6:05 PM
June 3, 2017
 

The CIFAR-10 dataset http://www.cs.toronto.edu/~kriz/cifar.html
http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html
kaggle past solutions http://ndres.me/kaggle-past-solutions/
Internet Trends Report 2017 http://www.kpcb.com/internet-trends
3 Million Instacart Orders, Open Sourced https://www.instacart.com/datasets/grocery-shopping-2017 

 
 Posted by at 8:29 PM
May 28, 2017
 
TensorFlow for Machine Intelligence https://github.com/backstopmedia/tensorflowbook
Picasso: A free open-source visualizer for Convolutional Neural Networks https://github.com/merantix/picasso
Network Dissection: Quantifying Interpretability of Deep Visual Representations http://netdissect.csail.mit.edu/
Google's TensorFlow numerical computation and machine learning library https://github.com/memo/ofxMSATensorFlow
tensorflow cnn http://upflow.co/l/msqr/2017/04/01/image-classification-using-convolutional-neural-networks-in-tensorflow
LSTM by Example using Tensorflow https://medium.com/towards-data-science/lstm-by-example-using-tensorflow-feb0c1968537
stanford imagenet dog dataset http://vision.stanford.edu/aditya86/ImageNetDogs/
GPU-accelerated Keras with Tensorflow or Theano on Windows 10 native https://github.com/philferriere/dlwin
Convolutional Neural Networks for Visual Recognition http://cs231n.stanford.edu/

 
 Posted by at 10:04 PM
May 19, 2017
 
2017 data science bowl 2nd https://github.com/dhammack/DSB2017/ https://github.com/juliandewit/kaggle_ndsb2017/
Introduction to reinforcement learning https://lufficc.com/blog/reinforcement-learning-and-implementation https://github.com/lufficc/dqn
Demystifying Deep Reinforcement Learning https://www.nervanasys.com/demystifying-deep-reinforcement-learning/
Santander Product Recommendation https://github.com/ttvand/Santander-Product-Recommendation
https://github.com/aaron-xichen/pytorch-playground Base pretrained models and datasets in pytorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)
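Berkeley reinforcement learning course http://rll.berkeley.edu/deeprlcourse/
DQN from Getting Started to Giving Up, Part 1: DQN and Reinforcement Learning https://zhuanlan.zhihu.com/p/21262246
Introduction to AlphaGo http://www0.cs.ucl.ac.uk/staff/D.Silver/web/Resources_files/AlphaGo_IJCAI.pdf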
geohash https://en.wikipedia.org/wiki/Geohash http://blog.csdn.net/zmx729618/article/details/53068170 http://www.cnblogs.com/dengxinglin/archive/2012/12/14/2817761.html
    Java libraries (groupId:artifactId:version):
    com.spatial4j:spatial4j:0.5
    ch.hsr:geohash:1.3.0
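For reference, the geohash encoding itself is short: longitude and latitude bits are interleaved by repeatedly halving each range, and every 5 bits become one base-32 character. The following is a minimal Python sketch of that standard scheme as described on the Wikipedia page linked above; it is not the API of either Java library listed here.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=12):
    """Encode a latitude/longitude pair as a geohash string of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


if __name__ == "__main__":
    print(geohash_encode(57.64911, 10.40744, precision=6))  # u4pruy
```

Nearby points usually share a prefix, which is what makes geohashes convenient as index keys, although points on opposite sides of a cell boundary can have very different hashes.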
http://blog.csdn.net/ghsau/article/details/50591932
deepmind papers https://deepmind.com/research/publications/
How we built Tagger News: machine learning on a tight schedule http://varianceexplained.org/programming/tagger-news/  https://github.com/dodger487
mnist data csv format https://pjreddie.com/projects/mnist-in-csv/
deep chatbots https://github.com/mckinziebrandon/DeepChatModels

 
 Posted by at 11:36 PM
May 12, 2017
 
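A recent real-world project requires building complex networks, an area where I had no hands-on practice; until now I had mostly read papers, especially on graph computation models for big data. Working with Giraph, the Hadoop-based graph computing framework used in production at Facebook, has deepened my understanding of Pregel, for example by implementing a heat conduction algorithm.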
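To make the Pregel idea concrete, here is a single-machine Python sketch of the vertex-centric model applied to heat conduction: in every superstep each vertex sends its temperature to its neighbours, then updates itself from the messages it received. This is not the Giraph implementation from the project; the update rule (blend with the mean of the neighbours, weight alpha) and all names are assumptions chosen for illustration, and real Pregel features such as vote-to-halt, combiners, and partitioning are left out.

```python
def run_supersteps(graph, values, supersteps, alpha=0.5):
    """Run a fixed number of Pregel-style supersteps.

    graph:  dict mapping each vertex to the list of its neighbours
    values: dict mapping each vertex to its initial temperature
    """
    values = dict(values)
    for _ in range(supersteps):
        # Message phase: every vertex sends its current value to each neighbour.
        inbox = {vertex: [] for vertex in graph}
        for vertex, neighbours in graph.items():
            for neighbour in neighbours:
                inbox[neighbour].append(values[vertex])
        # Compute phase: each vertex updates using only its own inbox.
        new_values = {}
        for vertex in graph:
            messages = inbox[vertex]
            if messages:
                mean = sum(messages) / len(messages)
                new_values[vertex] = (1 - alpha) * values[vertex] + alpha * mean
            else:
                new_values[vertex] = values[vertex]
        values = new_values
    return values


if __name__ == "__main__":
    triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
    print(run_supersteps(triangle, {"a": 9.0, "b": 0.0, "c": 0.0}, supersteps=1))
```

On this triangle the heat spreads from vertex a until all three vertices approach the same temperature; the point of the model is that each vertex needs only its own state and incoming messages, which is what lets a framework distribute the computation.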
Learning and practicing Hadoop graph frameworks: Giraph http://giraph.apache.org/ , http://grafos.ml/ http://arabesque.io/
Arabesque: A System for Distributed Graph Mining http://sigops.org/sosp/sosp15/current/2015-Monterey/printable/093-teixeira.pdf
tensorflow https://github.com/skcript/tensorflow-resources
spark https://github.com/endymecy/spark-ml-source-analysis
Turning off Spark runtime logging http://stackoverflow.com/questions/25193488/how-to-turn-off-info-logging-in-pyspark
Architecture of Giants: Data Stacks at Facebook, Netflix, Airbnb, and Pinterest https://blog.keen.io/architecture-of-giants-data-stacks-at-facebook-netflix-airbnb-and-pinterest-9b7cd881af54?imm_mid=0f1550
TensorFlow template application for deep learning https://github.com/tobegit3hub/deep_recommend_system
Top 20 Recent Research Papers on Machine Learning and Deep Learning http://www.kdnuggets.com/2017/04/top-20-papers-machine-learning.html
jblas http://jblas.org/
Spark machine learning https://book.douban.com/subject/26350074/
machine learning dataset http://persoal.citius.usc.es/manuel.fernandez.delgado/papers/jmlr/data.tar.gz
Do we Need Hundreds of Classifiers to Solve Real World Classification Problems? http://jmlr.org/papers/volume15/delgado14a/delgado14a.pdf
CTR prediction
Y. W. Chang, C. J. Hsieh, K. W. Chang, M. Ringgaard, and C.-J. Lin, “Training and testing low-degree polynomial data mappings via linear SVM,” Journal of Machine Learning Research, vol. 11, pp. 1471–1490, 2010.
T. Kudo and Y. Matsumoto, “Fast methods for kernel-based text analysis,” in Proceedings of the 41st Annual Meeting of the Association of Computational Linguistics (ACL), 2003
S. Rendle, “Factorization machines,” in Proceedings of IEEE International Conference on Data Mining (ICDM), pp. 995–1000, 2010.
B. McMahan, G. Holt, D. Sculley, “Ad Click Prediction: a View from the Trenches”
J. Pan, O. Jin, T. Xu, “Practical Lessons from Predicting Clicks on Ads at Facebook”
Y. Juan, Y. Zhuang, W. Chin, “Field-aware Factorization Machines for CTR Prediction”
G. James, D. Witten, T. Hastie, R. Tibshirani, “An Introduction to Statistical Learning”, 2013.
Neural Models for Information Retrieval https://arxiv.org/pdf/1705.01509.pdf
https://kowshik.github.io/JPregel/pregel_paper.pdf Pregel: A System for Large-Scale Graph Processing
Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction https://arxiv.org/abs/1704.05194
High Performance Linear Algebra OOP https://github.com/fommil/matrix-toolkits-java
Scraping JD.com review data https://github.com/awolfly9

 
 Posted by at 8:32 PM
May 5, 2017
 
learning deep learning with keras http://p.migdal.pl/2017/04/30/teaching-deep-learning.html
Mobile video in 2017: user profiles and trend forecast http://mp.weixin.qq.com/s?src=3&timestamp=1493786539&ver=1&signature=LmxQAN5pURJyKafpyA7pOMD85zwzyCuxMHTKCKqXVC7D-a*-DWOtWapMbH0LA6VWonKHQy1pAp*EX0bRu5lpiTELozDJCDoTiZL4*5LVI2sMLN5DECkwAslE1rtyOmOs8zc8opBzTAfPs7sN9uqgLVhn20e-HC1lJG7SBSj*gmQ=
https://chrisalbon.com/ Notes on Data Science, Machine Learning, & Artificial Intelligence
TensorFlow template application for deep learning https://github.com/tobegit3hub/deep_recommend_system.git
Python + Scrapy + MongoDB . 5 million data per day !!!💥 The world's largest website. 🔞  https://github.com/xiyouMc/WebHubBot
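QuestionAnsweringSystem is a question answering system implemented in Java that automatically analyzes questions and returns candidate answers https://github.com/ysc/QuestionAnsweringSystem
Python word-frequency analysis of Xinwen Lianbo (the CCTV evening news) https://github.com/maxiee/MyCodes/blob/master/PythonJiebaProjects/XWLB_words_freq/xwlb_jieba.py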
2nd place solution for the 2017 national datascience bowl http://juliandewit.github.io/kaggle-ndsb2017/
feature hash https://github.com/wush978/FeatureHashing
A framework for training and evaluating AI models on a variety of openly available dialog datasets https://github.com/facebookresearch/ParlAI


 
 Posted by at 10:25 PM