Posts by kristianjb

kristianjb

Jan 6, 2019

Graduate / Data Mining - KGSP for Graduate Studies [2]

Hello, I am applying for the 2019 Korean Government Scholarship Program (KGSP) for Graduate Studies, and this is my Goal of Study & Study Plan. I hope someone can take the time to help me fix any grammatical errors and tell me whether I need to edit the content and structure or add something to strengthen my case.

Goal of study & Study Plan

Goal of study, title or subject of research, and detailed study plan

I have chosen to pursue a master's degree in industrial and management engineering to advance my career toward the industrial application of machine learning. I hope to strengthen my foundation in mathematics and the research skills relevant to data science. By the end of the program, I envision myself being more capable of promoting a data-driven culture in my local community by combining engineering knowledge with business administration to improve productivity. In the long run, I want to build a solid reputation as a consultant and help organizations achieve optimization and continuous improvement through the effective use of data.

Data mining is the research area I am interested in, and it has fascinated me since my senior year at university. I developed a keen interest in data mining, particularly in clustering algorithms, while working on my undergraduate thesis. Clustering is an unsupervised machine learning technique that partitions a given set of objects into groups based on similarity or dissimilarity, modeling the underlying structure or distribution of the data in order to learn more about it. Clustering algorithms can be categorized in numerous ways, for example as partitional, hierarchical, distance-based, or density-based, according to the result they produce or the optimization they perform. In terms of optimization, a density-based clustering algorithm can be mathematically modeled as a multimodal optimization problem, which can turn out to be an NP-hard optimization problem.
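To make the idea of density-based clustering concrete, here is a minimal sketch of a DBSCAN-style procedure. It is offered only as an illustration and is not the algorithm from the thesis; the function names, the parameters `eps` and `min_pts`, and the sample points are all assumptions chosen for this example.

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # not dense enough to start a cluster
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Points in the same dense region share a label, while the isolated point is marked as noise with -1.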

Metaheuristic algorithms are effective in addressing the computational complexity of NP-hard optimization problems because they can find a near-optimal solution in the search space within a reasonable amount of time. The two main features of metaheuristic algorithms are exploration and exploitation. Exploration enables the algorithm to escape local optima while searching for solutions, whereas exploitation carefully examines a promising region of the search space to find the best solution within it. A metaheuristic algorithm is said to be effective if it strikes a balance between these two features.
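The balance between exploration and exploitation can be sketched with simulated annealing, one well-known metaheuristic. This is only an illustrative example under stated assumptions: the objective function, the cooling schedule, and all parameter values below are hypothetical and are not taken from the study plan.

```python
import math
import random


def objective(x):
    """A hypothetical multimodal function with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


def simulated_annealing(f, x0, steps=5000, temp0=10.0, cooling=0.999,
                        step_size=0.5, seed=0):
    """Minimize f starting from x0; returns (best_x, best_value)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    temp = temp0
    for _ in range(steps):
        candidate = x + rng.uniform(-step_size, step_size)
        f_cand = f(candidate)
        delta = f_cand - fx
        # Exploitation: always accept an improvement.
        # Exploration: accept a worse move with probability exp(-delta / temp),
        # which lets the search escape local optima while temp is still high.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            x, fx = candidate, f_cand
            if fx < best_fx:
                best_x, best_fx = x, fx
        temp *= cooling  # cooling shifts the balance toward exploitation
    return best_x, best_fx


best_x, best_fx = simulated_annealing(objective, x0=4.0)
```

Early on, the high temperature favors exploration; as the temperature drops, the search behaves more and more like a local refinement of the current region.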

Clustering and metaheuristics are significant in machine learning. Since an algorithm needs to be trained to learn how to classify and process information, its efficiency and accuracy depend on how well it was trained. Training an algorithm takes time and CPU resources, all the more so if the dataset is large and multidimensional. Thus, research on clustering techniques that scale to large multidimensional data is important for reducing computation time and using far fewer CPU resources. I want to take part in data mining research, specifically in using metaheuristics to develop new clustering algorithms that scale to large multidimensional datasets.

Since the MSIE program requires a total of 28 credits, four of which are compulsory for research, I would like to divide them among the following courses: three basic courses (500-level courses), four advanced courses (600-level courses), an independent study in a specific research area (700-level courses), and a research course. My end goal is to take Master Thesis Research (IMEN699), but I will need a stronger foundation in statistics and intelligent systems during my first year to prepare for my research. With that in mind, I have chosen the following subjects: Design and Analysis of Experiments (IMEN542), Time Series Analysis (IMEN677), Expert Systems (IMEN584), and Advanced Artificial Intelligence (IMEN683). I would also like to take Information Modeling (IMEN695) and Distributed Information System (IMEN781) to gain a thorough understanding of data that meets the needs of today's data science. Lastly, I want to use the opportunity of studying at POSTECH to take entrepreneurship-related courses in industrial and management engineering, such as Product Development Strategy (IMEN595) and Technology Planning (IMEN611), to learn from Korean case studies about the decisions company leaders make and their outcomes, as well as their strategies in technology and innovation.

kristianjb

Dec 26, 2018

Graduate / Personal Statement for KGSP (Graduate): Application to POSTECH [3]

Hello, I would like to seek help with my essay. This is a personal statement for my application for graduate studies at POSTECH.
I'd like to hear some feedback regarding grammar and structure, as well as the coherence of ideas in my essay. Please also indicate if I need to add something to strengthen my case. Thank you!

PERSONAL STATEMENT

o Motivations with which you apply for this program
o Your education and work experience in relation to the KGSP
o Reason for studying in Korea
o Any other aspects of your background and interests which may help us evaluate your aptitude and passion for graduate study or research.

Over the last few years, there has been a growing demand for data scientists. With that in mind, I would like to complement my four years of work experience in the IT industry, particularly in Java web application development, with a graduate degree in Industrial Engineering and Management at Pohang University of Science and Technology (POSTECH). I obtained my bachelor's degree in Applied Mathematics, majoring in Operations Research, from AB University in June 2015. During my studies, I learned various techniques in operations research, including optimization, statistics, numerical analysis, queuing theory, and decision theory.

During an internship at a geothermal plant, I was given the chance to apply what I had learned in a case study. I was tasked with investigating the pros and cons of merging three laboratories to minimize cost. In the beginning, I had difficulty choosing which method to use because the problem was broadly stated. Over time, however, I was able to narrow the scope to studying the queuing of chemical samples from the three laboratories. I was not able to produce conclusive evidence regarding the effects of merging the three laboratories because of inconsistencies in the data from one of them. Still, the study revealed that adding manpower to one laboratory and reallocating the workload in another would benefit their operations.

In addition to queuing analysis, the internship taught me three important points about handling data: (1) data collection, (2) data cleaning, and (3) data distribution. In data collection, it is important to identify the nature of the dataset. Since different mathematical models require different assumptions about the data, I had to be careful when communicating with the chemists to ensure that I identified the correct set of data for the queuing analysis. In data cleaning, I found a couple of chemical samples that were missing parameters such as the analysis end time, while others had unusual processing times that deviated greatly from the median. I excluded them from the computation. In data distribution, being able to run tests and identify a specific distribution has a significant effect on which queuing model to use. In general, the data distribution dictates the possible minimum and maximum values the dataset could have and can also help in identifying outliers.
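The data cleaning step described above can be sketched as follows. This is a hedged illustration rather than the procedure actually used during the internship: the record layout (keys `start` and `end`, in minutes), the sample values, and the cutoff based on the median absolute deviation are all assumptions made for this example.

```python
from statistics import median


def clean_samples(samples, threshold=3.0):
    """Drop samples with a missing end time or an extreme processing time.

    Each sample is a dict with hypothetical keys "start" and "end" (minutes).
    A sample is kept when its duration lies within `threshold` median
    absolute deviations (MAD) of the median duration.
    """
    complete = [s for s in samples if s.get("end") is not None]
    durations = [s["end"] - s["start"] for s in complete]
    if not durations:
        return []
    med = median(durations)
    mad = median([abs(d - med) for d in durations])
    if mad == 0:
        return complete  # no spread to compare against; keep all complete rows
    return [s for s, d in zip(complete, durations)
            if abs(d - med) <= threshold * mad]


# Hypothetical laboratory records: one has no end time, one took far too long.
samples = [
    {"id": "A", "start": 0, "end": 30},
    {"id": "B", "start": 5, "end": 37},
    {"id": "C", "start": 10, "end": 39},
    {"id": "D", "start": 12, "end": 43},
    {"id": "E", "start": 20, "end": 320},
    {"id": "F", "start": 25, "end": None},
]
cleaned = clean_samples(samples)
```

The median and MAD are used here because, unlike the mean and standard deviation, they are barely affected by the very outliers being screened out.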

For the past five years, POSTECH has consistently made it onto the list of top Asian universities, and it has a good reputation for research, particularly in the Department of Industrial Engineering and Management. I believe that studying at POSTECH would help me further develop my research skills in the field of Big Data, particularly in clustering algorithms. My undergraduate thesis, entitled XXXXX, was about a hybrid of a density-based clustering algorithm and a nature-inspired metaheuristic algorithm. Given the chance, I would like to continue doing research on these kinds of topics.

The idea of living abroad excites me not only because of the great opportunities that lie ahead but also because of the invaluable life experiences it has to offer: a sense of independence, friendships across borders, and a wider professional network. In 2013, I had the opportunity to be a short-term exchange student in Japan for 10 days. That experience made me realize that the world is big and that I should not confine myself to my comfort zone. Living in C. City, I encounter a lot of Koreans on a daily basis. They must like the beautiful beaches here in our country. I am also hoping to go to Korea someday to appreciate what the country has to offer, and through KGSP, I am eager to learn the Korean language and its alphabet, Hangeul, to better understand and appreciate Korean culture.