Thursday, December 26, 2013

Logistic Regression in Machine Learning

Before studying logistic regression, I would recommend going through these tutorials.
The first and most important thing about logistic regression is that it is not a "regression" algorithm but a "classification" algorithm. The name itself is somewhat misleading. Regression gives a continuous numeric output, but most of the time we need the output as classes (i.e. categorical, discrete values). For example, we may want to classify emails as "spam" or "not spam", treatments as "success" or "failure", statements as "right" or "wrong", transactions as "fraudulent" or "non-fraudulent", and so on. These are examples of classification problems with binary (also called dichotomous) output, which logistic regression can handle. Note that the output need not always be binary, but in this article I talk only about binary output.

Saturday, December 21, 2013

Nepal over last Four Decades with R

The World Bank DataBank provides data for all countries on more than 1000 indicators since 1960. It can be used for various kinds of statistical analysis. This weekend, I downloaded the data for Nepal and tried a few simple things in R, mainly for the purpose of learning R. R is a very powerful statistical analysis tool.

The data was downloaded as a CSV file, which can be read with the read.csv() function in R. We can run R commands from the R command line, and it's really interactive.

Gradient descent versus normal equation

Gradient descent and the normal equation are both methods for finding the parameters that minimize a cost function. I have given some intuition about gradient descent in a previous article. Gradient descent is an iterative method: we start with an initial guess of the parameters (usually zeros, but not necessarily) and gradually adjust those parameters until we get the function that best fits the given data points. The normal equation, in contrast, solves for the parameters analytically in a single step, without iterating. For linear regression with a squared-error cost, the cost function is convex, so the minimum that both methods reach is the global one. Here is the mathematical formulation of gradient descent in the case of linear regression.
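As a rough illustration of the two approaches, the sketch below fits a straight line y ≈ theta[0] + theta[1]*x both ways, assuming a single input feature and the mean squared error cost. The function names, the learning rate `alpha` and the iteration count are hypothetical choices for this example and are not taken from the article.

```c
#include <assert.h>
#include <stddef.h>

/* Batch gradient descent for y ~ theta[0] + theta[1]*x.
   Each iteration moves both parameters against the gradient of
   J = (1/(2m)) * sum((theta[0] + theta[1]*x[i] - y[i])^2). */
void fit_gradient_descent(const double *x, const double *y, size_t m,
                          double alpha, int iterations, double theta[2])
{
    assert(x != NULL && y != NULL && m > 0);
    double n = (double)m;
    theta[0] = 0.0;   /* initial guess: zeros */
    theta[1] = 0.0;
    for (int it = 0; it < iterations; ++it) {
        double g0 = 0.0, g1 = 0.0;
        for (size_t i = 0; i < m; ++i) {
            double err = theta[0] + theta[1] * x[i] - y[i];
            g0 += err;          /* gradient term for theta[0] */
            g1 += err * x[i];   /* gradient term for theta[1] */
        }
        /* simultaneous update of both parameters */
        theta[0] -= alpha * g0 / n;
        theta[1] -= alpha * g1 / n;
    }
}

/* Normal equation for the same model, written out in closed form
   for the one-feature case (no iterations, no learning rate). */
void fit_normal_equation(const double *x, const double *y, size_t m,
                         double theta[2])
{
    assert(x != NULL && y != NULL && m > 0);
    double n = (double)m;
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < m; ++i) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    double denom = n * sxx - sx * sx;
    assert(denom != 0.0);   /* all x equal: the line is not determined */
    theta[1] = (n * sxy - sx * sy) / denom;
    theta[0] = (sy - theta[1] * sx) / n;
}
```

On well-behaved data the two functions should agree closely. Gradient descent needs a suitable `alpha` and enough iterations, while the closed form needs neither but requires solving a linear system, which becomes expensive when there are many features.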

Tuesday, December 10, 2013

Numerical Methods Tutorials

This section consists of various numerical methods problems and their solutions in the C language. You can click each link to view the C source code for the corresponding problem.

  1. Solution of Differential Equation using RK4 method
  2. Solution of Non-linear equation by Bisection Method
  3. Solution of Non-linear equation by Newton Raphson Method
  4. Solution of Non-linear equation by Secant Method
  5. Interpolation with unequal intervals by Lagrange's Method
  6. Linear Curve Fitting
  7. Parabolic Curve Fitting
  8. Gauss Jordan Method
  9. Determinant of an NxN Matrix
  10. Inverse of an NxN Matrix
  11. Integration using Trapezoidal Rule
  12. Integration using Simpson's 3/8 Rule
  13. Integration using Simpson's 1/3 Rule
  14. Greatest Eigenvalue and Eigenvector using Power Method
  15. Condition number and ill-conditioning check
  16. Newton's Forward and Backward interpolation
  17. 2 Dimensional matrix multiplication 
Note: All the code was compiled with the GCC MinGW compiler on Windows. Compiling with another compiler or on another platform may produce errors. These tutorials are aimed at students and intended for learning, so the code may not be optimized for real-world use. Some operations, such as matrix inversion and the determinant, are done without pivoting, so a divide-by-zero error may occur in some cases. Partial or full pivoting is recommended.
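To show what the recommended pivoting could look like, here is a minimal sketch of a determinant routine that uses Gaussian elimination with partial pivoting. It is not the code from the linked tutorial; the function name and the row-major storage layout are assumptions made for this example. Before eliminating each column, it swaps in the row with the largest absolute value in that column, which avoids dividing by a zero pivot whenever a nonzero one is available.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Determinant of an n x n matrix stored row-major in a[0..n*n-1].
   The matrix is overwritten during elimination. */
double determinant_partial_pivot(double *a, size_t n)
{
    assert(a != NULL && n > 0);
    double det = 1.0;
    for (size_t k = 0; k < n; ++k) {
        /* Partial pivoting: pick the row at or below k with the
           largest absolute value in column k. */
        size_t p = k;
        for (size_t i = k + 1; i < n; ++i) {
            if (fabs(a[i * n + k]) > fabs(a[p * n + k])) {
                p = i;
            }
        }
        if (a[p * n + k] == 0.0) {
            return 0.0;   /* whole column is zero: matrix is singular */
        }
        if (p != k) {
            for (size_t j = 0; j < n; ++j) {
                double tmp = a[k * n + j];
                a[k * n + j] = a[p * n + j];
                a[p * n + j] = tmp;
            }
            det = -det;   /* a row swap flips the sign of the determinant */
        }
        det *= a[k * n + k];
        /* Eliminate the entries below the pivot. */
        for (size_t i = k + 1; i < n; ++i) {
            double f = a[i * n + k] / a[k * n + k];
            for (size_t j = k; j < n; ++j) {
                a[i * n + j] -= f * a[k * n + j];
            }
        }
    }
    return det;
}
```

A matrix whose first entry is zero, such as {0, 1, 2, 1, 0, 3, 4, -3, 8} read as 3x3, would cause a division by zero without pivoting, but is handled here by the row swap. Pass a copy of the matrix if the original values are still needed afterwards.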