A Preliminary Study of Machine Learning
Gradient
Introduction to Artificial Intelligence
The code address of this article is: Example 01
The source code is in ipynb format, so the output can be viewed directly.
Logistic regression to diagnose heart disease
The project source code URL: Heart
Foundation of Artificial Intelligence - Lecture 1
No obvious solution ==> algorithm engineers handle it. A clear implementation path ==> development engineers handle it.
{Ace of hearts, 10 of spades, 3 of spades, 9 of hearts, 9 of clubs, 4 of diamonds, J}
First: hearts > diamonds > spades > clubs. Second: within each suit, numbers are arranged from small to large.
\[ 1024 \approx 10^3 \rightarrow 1\text{K} \]
\[ 1024^2 \approx 10^6 \rightarrow 1\text{M} \]
\[ 1024^3 \approx 10^9 \rightarrow 1\text{G} \]
2.6 GHz
\[ Time(N) - Time(N-1) = constant \]
\[ Time(N-1) - Time(N-2) = constant \]
\[ Time(N-2) - Time(N-3) = constant \]
\[ \cdots \]
\[ Time(2) - Time(1) = constant \]
Summing both sides telescopes to:
\[ Time(N) - Time(1) = (N-1) \cdot constant \]
\[ Time(N) = (N-1) \cdot constant + Time(1) \]
\[ Time(N) = N \cdot constant + (Time(1) - constant) \]
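The telescoping result above can be sanity-checked numerically. A minimal sketch, where the values of `CONSTANT` and `TIME_1` are arbitrary assumptions chosen for illustration:

```python
# Verify Time(N) = N * constant + (Time(1) - constant) by iterating the
# recurrence Time(k) = Time(k-1) + constant.
CONSTANT = 3.0  # arbitrary illustration value
TIME_1 = 5.0    # arbitrary illustration value

def time_by_recurrence(n):
    t = TIME_1
    for _ in range(n - 1):
        t += CONSTANT
    return t

def time_closed_form(n):
    return n * CONSTANT + (TIME_1 - CONSTANT)

for n in (1, 2, 10, 100):
    assert time_by_recurrence(n) == time_closed_form(n)
print(time_closed_form(100))  # 302.0
```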
SVM-based Text Classification in Practice
The source code: SVM-based Text Classification in Practice
'cnews.train.txt' is too large to upload directly, so it is provided compressed; decompress it before importing.
We implement simple text classification based on a bag-of-words representation and a support vector machine (SVM).
Chinese news data serves as the sample data set: 50,000 training examples and 10,000 test examples, divided into 10 categories: sports, finance, real estate, home furnishing, education, technology, fashion, current affairs, games, and entertainment. The following code loads the training text so we can inspect the data format and a few samples:
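A minimal loading sketch, assuming each line of the file is `label<TAB>content` (the actual path and exact format come from the downloaded data):

```python
import os
import tempfile

def load_corpus(path):
    """Read a label<TAB>content file into parallel label/content lists."""
    labels, contents = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            label, _, content = line.strip().partition("\t")
            if content:
                labels.append(label)
                contents.append(content)
    return labels, contents

# Tiny fabricated sample purely for illustration (not the real cnews data):
sample = "体育\t马晓旭意外受伤让国奥警惕\n财经\t某公司发布季度财报\n"
with tempfile.NamedTemporaryFile("w", delete=False, encoding="utf-8") as f:
    f.write(sample)
    tmp_path = f.name
labels, contents = load_corpus(tmp_path)
os.remove(tmp_path)
print(labels)  # ['体育', '财经']
```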
Taking the first item of the training data as an example, we segment the loaded news text into words. Here I use LTP's word segmentation function (jieba also works); the segmented words are displayed separated by "/".
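The article uses LTP (or jieba) for real Chinese word segmentation; the stand-in segmenter below just splits on spaces and exists only to show the "/"-separated display format:

```python
def segment(text):
    # Placeholder: splits on whitespace. In practice, replace with a real
    # segmenter, e.g. list(jieba.cut(text)).
    return text.split()

words = segment("马晓旭 意外 受伤 让 国奥 警惕")
print("/".join(words))  # 马晓旭/意外/受伤/让/国奥/警惕
```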
To tidy up the logic above, we implement a class that loads the training and test data and performs word segmentation.
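A sketch of such a loader class; the file format (`label<TAB>content`) and the injected `segment_fn` are assumptions (swap in LTP or jieba for real use):

```python
class CorpusLoader:
    """Loads a data file and segments each document."""

    def __init__(self, segment_fn):
        self.segment_fn = segment_fn  # e.g. jieba.lcut in real use

    def load(self, path):
        """Return (labels, segmented_docs) for one data file."""
        labels, docs = [], []
        with open(path, encoding="utf-8") as f:
            for line in f:
                label, _, content = line.strip().partition("\t")
                if content:
                    labels.append(label)
                    docs.append(self.segment_fn(content))
        return labels, docs
```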
After spending some time on word segmentation, you can start building a dictionary. The dictionary is built from the training set and sorted by word frequency.
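A frequency-sorted dictionary can be sketched with `collections.Counter`; reserving id 0 for out-of-vocabulary words is a design choice of this sketch, not necessarily the article's:

```python
from collections import Counter

def build_vocab(segmented_docs, max_size=None):
    """Map words to ids, most frequent word first.
    Id 0 is reserved for out-of-vocabulary words (an assumption here)."""
    counter = Counter(w for doc in segmented_docs for w in doc)
    return {w: i + 1 for i, (w, _) in enumerate(counter.most_common(max_size))}

vocab = build_vocab([["a", "b", "a"], ["b", "a", "c"]])
print(vocab)  # {'a': 1, 'b': 2, 'c': 3}
```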
In addition, the category labels themselves form a small "dictionary" of their own:
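A minimal sketch of the label dictionary; sorting makes the ordering deterministic, though the article's actual ordering may differ:

```python
def build_label_map(labels):
    # sorted() gives a deterministic ordering (an assumption of this sketch).
    return {lab: i for i, lab in enumerate(sorted(set(labels)))}

label_map = build_label_map(["sports", "finance", "sports", "games"])
print(label_map)  # {'finance': 0, 'games': 1, 'sports': 2}
```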
Next, construct the id-based training and test sets in the format libsvm expects. Because we only consider the bag of words, word order is discarded and each document keeps just word ids and their counts.
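One document becomes one libsvm-format line, `<label> <id>:<count> ...`; a minimal sketch (libsvm requires feature ids in ascending order, hence the `sorted()`):

```python
from collections import Counter

def to_libsvm_line(label_id, doc, vocab):
    """Format one segmented document as '<label> <id>:<count> ...'."""
    counts = Counter(vocab[w] for w in doc if w in vocab)
    feats = " ".join(f"{i}:{c}" for i, c in sorted(counts.items()))
    return f"{label_id} {feats}"

vocab = {"a": 1, "b": 2, "c": 3}
print(to_libsvm_line(0, ["b", "a", "b"], vocab))  # 0 1:1 2:2
```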
The remaining core model is simple: use libsvm to train the support vector machine. Feed it the training and test files prepared above and train with libsvm's existing routines, trying different parameter settings. The libsvm documentation can be viewed here; the "-s", "-t", and "-c" parameters are the most important: they select the SVM type, the kernel function, and the penalty coefficient.
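A sketch of an `svm-train` invocation built in Python. The file names are placeholders and the parameter values are illustrative, not the article's actual settings:

```python
# -s 0 -> C-SVC, -t 0 -> linear kernel, -c 1 -> penalty coefficient C = 1
# "cnews.train.libsvm" / "cnews.model" are placeholder file names.
cmd = ["svm-train", "-s", "0", "-t", "0", "-c", "1",
       "cnews.train.libsvm", "cnews.model"]
print(" ".join(cmd))
# To actually run it (requires libsvm's svm-train on your PATH):
#   subprocess.run(cmd, check=True)
```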
After a period of training, we can examine the experimental results. Try different SVM types, penalty coefficients, and kernel functions to optimize the results.
The code address of this article is: auto operation weibo. Chromedriver download: Taobao Mirror (the version must match your installed Chrome version).
The source code: Boston House
To obtain results quickly, we hope to gain predictive power by fitting a simple function:
\[ f(rm) = k * rm + b \]
\[ Loss(k, b) = \frac{1}{n} \sum_{i \in N} (\hat{y_i} - y_i) ^ 2 \] \[ Loss(k, b) = \frac{1}{n} \sum_{i \in N} ((k * rm_i + b) - y_i) ^ 2 \]
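The loss above translates directly into code. A minimal sketch; the `rm` and `y` values are fabricated stand-ins for the Boston data's room counts and prices:

```python
def loss(k, b, rm, y):
    """Mean squared loss of the line f(rm) = k * rm + b."""
    n = len(y)
    return sum(((k * r + b) - t) ** 2 for r, t in zip(rm, y)) / n

# Fabricated toy data lying exactly on y = 3 * rm - 2:
rm = [4.0, 5.0, 6.0]
y = [10.0, 13.0, 16.0]
print(loss(3.0, -2.0, rm, y))       # 0.0 (perfect fit)
print(loss(0.0, 0.0, rm, y) > 0.0)  # True (any other line has higher loss)
```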
\[ Loss(k, b) = \frac{1}{n} \sum_{i \in N} ((k * rm_i + b) - y_i) ^ 2 \]
\[ \frac{\partial{Loss(k, b)}}{\partial{k}} = \frac{2}{n}\sum_{i \in N}(k * rm_i + b - y_i) * rm_i \]
\[ \frac{\partial{Loss(k, b)}}{\partial{b}} = \frac{2}{n}\sum_{i \in N}(k * rm_i + b - y_i) \]
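Gradient descent with these two partial derivatives can be sketched as follows; the learning rate, step count, and toy data are illustrative assumptions:

```python
def fit(rm, y, lr=0.01, steps=50000):
    """Gradient descent on Loss(k, b) using the partials derived above."""
    n = len(y)
    k, b = 0.0, 0.0
    for _ in range(steps):
        err = [(k * r + b) - t for r, t in zip(rm, y)]
        dk = 2 / n * sum(e * r for e, r in zip(err, rm))  # dLoss/dk
        db = 2 / n * sum(err)                             # dLoss/db
        k -= lr * dk
        b -= lr * db
    return k, b

# Toy data generated from y = 3 * rm - 2 (not the real Boston data):
rm = [4.0, 5.0, 6.0]
y = [10.0, 13.0, 16.0]
k, b = fit(rm, y)
print(round(k, 4), round(b, 4))  # 3.0 -2.0
```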
Now we want to turn the housing-price forecast into a more expressive and sophisticated model. What should we do?
\[ f(x) = k * x + b \]
\[ f(x) = k_2 * \sigma(k_1 * x + b_1) + b_2 \]
\[ \sigma(x) = \frac{1}{1 + e^{-x}} \]
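The composed model above, written out directly: a linear map, a sigmoid squashing, then another linear map. The parameter values in the demo call are arbitrary illustrations:

```python
import math

def sigmoid(x):
    """The logistic function sigma(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def f(x, k1, b1, k2, b2):
    # Linear map -> sigmoid squashing -> second linear map.
    return k2 * sigmoid(k1 * x + b1) + b2

print(sigmoid(0))        # 0.5
print(f(0, 1, 0, 2, 1))  # 2 * 0.5 + 1 = 2.0
```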
We can implement more complex functions by repeatedly composing simple, basic modules.
But for increasingly complex functions, how does the computer compute derivatives?
\[ \text{L2-Loss}(y, \hat{y}) = \frac{1}{n}\sum{(\hat{y} - y)^2} \]
\[ \text{L1-Loss}(y, \hat{y}) = \frac{1}{n}\sum{|\hat{y} - y|} \]
Change L2-Loss to L1-Loss and perform gradient descent.
Implement L1-Loss gradient descent from scratch.
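A from-scratch sketch: the derivative of |e| with respect to e is sign(e) (taken as 0 at e = 0), so the L1 gradients simply replace the error term in the L2 gradients with its sign. The learning rate, step count, and toy data are illustrative assumptions; with a fixed step size, subgradient descent hovers near the optimum rather than converging exactly:

```python
def sign(e):
    return (e > 0) - (e < 0)

def fit_l1(x, y, lr=0.01, steps=20000):
    """(Sub)gradient descent on L1-Loss for f(x) = k * x + b."""
    n = len(y)
    k, b = 0.0, 0.0
    for _ in range(steps):
        s = [sign((k * xi + b) - yi) for xi, yi in zip(x, y)]
        k -= lr * sum(si * xi for si, xi in zip(s, x)) / n  # d L1-Loss / dk
        b -= lr * sum(s) / n                                 # d L1-Loss / db
    return k, b

# Toy data on the line y = 2 * x + 1 (fabricated for illustration):
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
k, b = fit_l1(x, y)
print(round(k, 1), round(b, 1))
```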
When there are very many dimensions, normalization or standardization prevents one or a few dimensions from dominating the data, and it also lets the program run faster. There are many methods, such as min-max normalization, z-score standardization, p-norm, etc.; which one to use depends on the characteristics of the data set.
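Two of the methods mentioned above, sketched in plain Python on a fabricated feature column:

```python
def min_max(xs):
    """Min-max normalization: scale values linearly into [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def z_score(xs):
    """Z-score standardization: zero mean, unit (population) std dev."""
    n = len(xs)
    mean = sum(xs) / n
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    return [(x - mean) / std for x in xs]

col = [2.0, 4.0, 6.0, 8.0]  # fabricated feature column
print(min_max(col))  # first value 0.0, last value 1.0
print(z_score(col))  # mean 0, variance 1
```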
Further reading: The Myth of Data Standardization in Deep Learning (数据标准化的迷思之深度学习领域)
Divide the data set: 20% of the data is used as the test set (X_test, y_test) and the other 80% as the training set (X_train, y_train), where random_state is the random seed.
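In practice sklearn's `train_test_split` does this in one call; a minimal manual equivalent, with illustrative `test_size` and seed values:

```python
import random

def split_train_test(X, y, test_size=0.2, random_state=42):
    """Shuffle indices with a fixed seed, then slice off the test portion."""
    idx = list(range(len(X)))
    random.Random(random_state).shuffle(idx)
    n_test = int(len(X) * test_size)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train_idx], [X[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])

X = [[float(i)] for i in range(10)]  # fabricated feature rows
y = list(range(10))
X_train, X_test, y_train, y_test = split_train_test(X, y)
print(len(X_train), len(X_test))  # 8 2
```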
Boston house price CART regression tree
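The heart of a CART regression tree is the split search: pick the threshold on a feature that minimizes the total squared error of the two leaf means, then recurse. In practice sklearn's `DecisionTreeRegressor` does this; below is a single-split sketch on fabricated toy data (not the Boston set):

```python
def best_split(x, y):
    """Find the threshold on one feature minimizing the summed squared
    error of the two resulting leaf means (one CART node's search)."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best_err, best_t = float("inf"), None
    for t in sorted(set(x)):
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        err = sse(left) + sse(right)
        if err < best_err:
            best_err, best_t = err, t
    return best_t, best_err

# Fabricated toy data: two flat "price" levels separated at x = 3.
x = [1, 2, 3, 10, 11, 12]
y = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]
t, err = best_split(x, y)
print(t, err)  # 3 0.0
```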
!> Before running this code, please ensure that the relevant dependencies have been installed.
The code address of this article is: digit recognition
output
The code address of this article is: example_01_Assignment