Tag: coursera

  • How I Developed a Next Word Prediction App for My Capstone Project

    Last month, on Sept 10th, I finally finished the Capstone project, the tenth course of the Coursera Data Science Specialization. I stayed up for more than four nights in a row, the latest until 5:30 am the next morning. I don’t remember the last time I did this since I finished my Ph.D.

    I remember fighting one challenge after another during this project. So many times, I felt that I might just not have the character to be a data scientist. I made so many mistakes during the process, yet those were also how I learned. Bit by bit, I kept correcting, tweaking, and improving my program.

    At one point, I thought I had a final product. Then I found that someone on the course forum had provided a benchmark program that tests how well your app performs. Yet plugging your own program into the benchmark was itself quite a challenge.

    In the end, I saw that a lot of other submissions hadn’t even bothered to do it. However, plugging my app into this benchmark program forced me to compare my app’s performance with others’, which in turn forced me to keep improving and debugging.

    In the end, the result was satisfying: I created my own next-word prediction app, and I am quite pleased with how it compares to the peers’ work that I saw while reviewing peer assignments:

    https://maggiehu.shinyapps.io/NextWords/

    My app provides not only the top candidates for the next word, but also the weight of each candidate, computed with the Stupid Backoff method. I also tried to model it after the predictive text function on cellphones, which lets the user click on the preferred candidate to auto-fill the text box. Below is a screenshot of the top predicted candidate words when you type “what doesn’t” in the text box.
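
    To make the weighting concrete, here is a minimal sketch of Stupid Backoff scoring in Python. It is not the code behind the app (which was built in R); the toy corpus, the function names, and the backoff factor of 0.4 are all assumptions chosen for illustration. The idea is to score a candidate by its relative frequency after the longest matching context, and to multiply by the backoff factor each time the context has to be shortened.

```python
from collections import Counter

ALPHA = 0.4  # backoff factor; an assumed value, not taken from the app


def build_ngram_counts(tokens, max_order=3):
    """Count every n-gram of order 1..max_order in a list of tokens."""
    counts = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def stupid_backoff_score(word, context, counts, total_tokens):
    """Score `word` following `context`, a tuple of preceding words."""
    weight = 1.0
    while context:
        ngram_count = counts[context + (word,)]
        if ngram_count > 0:
            # Relative frequency at this order, discounted by earlier backoffs.
            return weight * ngram_count / counts[context]
        context = context[1:]  # drop the earliest word and back off
        weight *= ALPHA
    # No context left: fall back to the unigram relative frequency.
    return weight * counts[(word,)] / total_tokens


def predict_next(text, counts, total_tokens, vocabulary, max_order=3, top_k=3):
    """Return the top_k (word, score) pairs for the end of `text`."""
    words = tuple(text.lower().split())
    context = words[-(max_order - 1):] if max_order > 1 else ()
    scored = [
        (w, stupid_backoff_score(w, context, counts, total_tokens))
        for w in vocabulary
    ]
    scored = [(w, s) for w, s in scored if s > 0]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:top_k]


# Toy corpus, made up for this example.
tokens = (
    "what doesn't kill you makes you stronger "
    "what doesn't work today may work tomorrow"
).split()
counts = build_ngram_counts(tokens)
print(predict_next("what doesn't", counts, len(tokens), set(tokens)))
```

    The scores are not true probabilities (they do not sum to one), which is why they are better described as weights.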

    And here is the accompanying R presentation. (The RStudio Presenter program implements very clumsy CSS styles, which cost me an additional two hours after the long marathon of debugging and tweaking the app itself. So I really wish that the course had not required RStudio Presenter for the course presentation.)

  • Coursera Data Science Specialization Capstone Project – thoughts

    Finally, I am at the capstone project. After three years of working on and off on this Coursera specialization, I am finally here.

    The project gives you a set of text documents and asks you to mine the texts and come up with your own model. So far, I am on week 2. I haven’t dived deep enough into the project yet, so I don’t know exactly how I am going to mine the texts or what kind of model I will use. But last week, while preparing our 3-minute “what is your passion” presentation for our Monday team retreat at Leadercast, I came across Maslow’s hierarchy of needs. I think it would be neat to look at the words in each level of the hierarchy, and see how frequently people use words from each level in their daily blog posts, tweets, and news.

    Maslow's Hierarchy

    To do this, I need to:

    1. Obtain a dictionary and have all words categorized into Maslow’s hierarchy
    2. Run all words in the files against the dictionary to determine which hierarchy they belong to.
      1. Calculate the frequency of each unique word
      2. Calculate the frequency of each level
    3. It would be fun to look at the frequency of each level in general; then look at the correlations between each level.
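
    The steps above could be sketched in Python roughly as follows. The tiny word-to-level dictionary, the sample text, and the function names are all made up for illustration; a real dictionary categorizing words by Maslow level would have to be obtained or built first.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary: word -> Maslow level (illustration only).
MASLOW_LEVELS = {
    "food": "physiological",
    "sleep": "physiological",
    "safe": "safety",
    "job": "safety",
    "friend": "love/belonging",
    "family": "love/belonging",
    "respect": "esteem",
    "dream": "self-actualization",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def level_frequencies(text, dictionary=MASLOW_LEVELS):
    """Return (frequency of each unique word, frequency of each level)."""
    word_counts = Counter(tokenize(text))
    level_counts = Counter()
    for word, count in word_counts.items():
        level = dictionary.get(word)
        if level is not None:  # words outside the dictionary are skipped
            level_counts[level] += count
    return word_counts, level_counts


sample = "My family and my friend share food. A safe job lets me sleep and dream."
word_counts, level_counts = level_frequencies(sample)
print(level_counts)
```

    Looking at correlations between levels would additionally need the level counts per document (one row per blog post, tweet, or news item) rather than one total, which this sketch does not cover.
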
  • My Recent Thoughts on Coursera Courses

    MOOCs (Massive Open Online Courses) have gained a lot of attention since 2011, and the topic was really “hot” in 2012. Although its temperature has dropped in 2014, the revenue mechanism seems to be maturing, based on my observation of people who are taking courses on Coursera and paying for the certificates.

    I remember the first time I noticed that Coursera offered paid certificates for its courses. You can still take all the courses for free, but you can also choose to pay about $49 (sometimes more or less) per course to get a completion certificate. When I first saw this, my reaction was: if learners need to pay for the certificates, should the courses still claim to be “Open”? In my mind back then, and probably in many others’, “open” equaled “free”. So I was against the idea of charging money for MOOCs.

    However, my view of “paid” MOOCs changed after I started taking a series of courses on Data Science last Friday. I chose this series because “Big Data” is hot right now, and I want to know more about how I can (possibly) apply data mining strategies in education. This course series seems to be a good fit. I noticed that the series has a “Specialization” option, and I said to myself: let’s take the first course and see what happens. After two days of working on the course, I felt it was totally worth it. I paid the 49 dollars for the signature track of the course and plan to do so for the remaining eight courses and the capstone project in this specialization.

    Ok, now let’s look at and summarize what I learned about Coursera, MOOC, and paid certificates within MOOC:

    1. First of all, looking at the price of $49 now, I think: “should we even call this a paid course if we are paying $49 for a certificate?” I’ve come across quite a few Coursera courses, and I admit that some of them are not that good. But when they are good, they are good. So for a well-organized, carefully designed and lectured course that taught me a lot of useful information, skills, and techniques, I think $49 is next to nothing. Think about how much online programs charge for a “program certificate” (here, here, and here). In this case, I think that even with a paid certificate option, MOOCs are still “free” and “open”.
    2. Coursera is really smart to restrict its course providers to accredited institutions (usually top institutions). When Coursera started, there were many other similar MOOC providers/platforms, such as Udacity, Udemy, edX, etc. By now, all of them have chosen their own business (or non-business) model. Coursera chose to stay with the traditional top institutions. This idea wasn’t very attractive at first, and didn’t sound very “open” and “grass-rooty” to me. But now that I think about my experiences of taking the courses, I think this might have been the simplest way to ensure course quality. We can argue that non-top universities and institutions, or even individuals, can also create high-quality MOOCs, but selecting only top institutions as course providers may be the smartest thing that Coursera has done so far: top institutions usually have more resources in terms of instructors, graduate assistants, course materials development, etc. In turn, top institutions attract more learners because of their names in the traditional education field.
    3. Will the Coursera certificate and its kind become more and more “useful” and “popular” on resumes and in the job market? I think so. I think having a MOOC certificate means more than just having taken an online course: it means that you were interested, you were willing to learn, and you took action. Potential employers would definitely be more impressed with the resume of a job applicant who has MOOC certificates than with one who has none. I think it is becoming a trend for lifelong learners to take online courses, and therefore MOOC certificates have a future too. I took this Coursera poll, found on the right side of Coursera’s homepage, this morning around 9:30 am EST on Wednesday, May 21, 2014. I made a screen capture of the result after I clicked the “Submit” button because I think it is interesting. As you can see from the screen capture, 33% of poll participants thought it is very important to earn a certificate after completing a Coursera course, 24% chose important, and another 24% chose somewhat important. We can argue that the sample group itself is probably biased, since people who take the poll might already be very engaged with learning via Coursera. Yet I think it backs my point here perfectly.
  • Big Data and Education

    Recently I have been hearing people talk about “big data”. Supposedly it is a popular concept nowadays in the IT field. So I searched on Coursera and came across the course “Big Data and Education”. It was offered by Ryan Baker at Columbia University in Oct 2013. Unfortunately it is no longer offered, but here are the archived course materials: http://www.columbia.edu/~rsb2162/bigdataeducation.html

    I started watching the course videos and began finding various useful resources. For example: the largest public data repository for educational software activities, the PSLC DataShop: https://pslcdatashop.web.cmu.edu/

    I think it would be interesting to run some data analysis based on certain data there and to see what can be “mined”.