Intro: Exploring Project Euler (#25)

This blog post marks a slight change in direction for me, as well as the introduction to what I hope will become a series here: "Exploring Project Euler."  It is my intention to publish multiple posts on this topic, so if you see some variation of that title multiple times on this blog just know that (1) these are distinct but loosely related posts, and (2) this particular post is the first installment, where I will dedicate some time to introducing and explaining the concept behind this series.  I plan on linking to this blog at the beginning of each subsequent post to prevent any confusion.  

To begin with, I will give an overview of what Project Euler is exactly.  It is a website dedicated to hosting a number of computation problems that can be solved using computer programming.  I will specifically be examining the archives which, as of this writing, contain 736 of these problems.  An interesting detail concerning these problems is that they are not listed randomly, but instead by order of difficulty.  The problems, beginning with number one, start relatively simple.  They then increase in difficulty as the number rises.  For some interesting context, 965,893 people have solved Problem #1, while Problem #736 has only been solved by 140 people as of this writing.  As an aside, I will leave a link to the website at the bottom of this post.  

Now that I've explained what Project Euler is, I can give some context as to why I would like to work with the website for this series.  The most general reasoning, as with many of my posts, is that I view it as an excellent learning opportunity.  Solving a problem and moving on to one that is more difficult means continuing to push my abilities and knowledge with regards to programming, and having a clearly defined and quantifiable progression pathway such as working my way up the archive numbers adds an element of fun to the process.  It also allows me to more accurately gauge my progress and probe for topics I should devote more time towards.  

As for what this has to do with data science, the answer is quite a bit actually.  When working with large datasets and complex machine learning algorithms, one of the most vital resources can be time.  Specifically, computation time.  The most accurate models in the world are not nearly as useful if they require an inordinate amount of time to complete their calculations.  One of the key features present in many Project Euler problems is that they focus on reducing computational complexity or working with obscenely large numbers.  They demand using efficient code or risking that your potential solution will result in a memory error.  Beyond that skill which can clearly be translated to data science applications, many of the problems deal directly with data science concepts.  Perhaps most importantly though, the problems require outside of the box thinking and learning new mathematical concepts, both skills that are incredibly useful to a data scientist or anyone working with numbers.  

So why start at Problem #25, you may be wondering if you read the title.  Unfortunately, the answer here is rather boring but also leads into the final reason why I've decided to use this website in particular:  I've solved problems on Project Euler in the past, and that is simply the point I reached before other work took priority.  This previous experience with the site serves as that final reason.  Although I didn't make it very far in the past, the problems I did work on taught me an incredible amount about both programming and mathematics, and I would like to continue that journey.  

In terms of structure, these posts will vary somewhat and this is definitely subject to change based on extenuating circumstances that may surround a particular Project Euler problem.  Even the number of problems addressed will likely not be the same in each post.  For example, this post will only examine a single problem since quite a bit of time was taken to explain the website and introduce the series.  Most subsequent posts will likely consider 2-3 problems, and this will vary based upon the amount of explanation required for the methodology and how much commentary is provided on the concepts covered in each problem.  Another quick point I would like to add is that I won't necessarily cover every single problem in order.  I may skip over covering problems that don't seem particularly interesting, ones that aren't related to any outside topics worth explaining, or quite simply ones where I hit a roadblock and cannot solve them.  In some cases I may still take a look at these problems I can't solve, for example if the roadblock that I run into teaches an interesting lesson or is caused by a problem that is worth exploring further.   In short, although this is technically a series, the installments will vary greatly and likely be focused more on the specific content of each question.

With all of the explanation of the website and series out of the way it is finally time to examine the first question from Project Euler.  The full question can be seen in the image below, and I will give a more detailed explanation underneath it.


So to put this in slightly more simple terms, the problem is requesting that we find the minimum amount of Fibonacci numbers that must be in a sequence in order for the final term to contain 1000 digits.  The above image somewhat explains what a Fibonacci number is, but I'll offer my own explanation as well.  A Fibonacci sequence starts at one, and the next number can be found by adding together the current number and previous number.  In this first case the current number is one, and there is no previous number so we add zero, making the next number in the sequence another one.  Now the sequence is {1, 1} and the current number is the second one, so we add that with the previous one to find the next number of the sequence, making it {1, 1, 2}.  Two is now the current number, and will be added to the one directly before it giving us {1, 1, 2, 3}.  This process goes on and on for as long as you would like to make the sequence.  Below is the code that I used to solve this problem.  I will explain the methodology that I used underneath that. 

n = 2
x = 1
y = 1
z = 0
i = 0
while i < 1000:
  z = x + y
  n+=1
  i = len(str(z))
  x = y
  y = z
print(n,":",z,":",i)

To start off I will explain what each of the initial variables represents here (this is the first five lines of code).  The 'n' represents the current length of the sequence.  I am initializing the sequence at {1, 1} with the first one represented by 'x' and the second by 'y', and starting at {1, 1} is the reason I initialized the length of the sequence as two.  The 'z' is what will represent the next number in the sequence, and 'i' represents the number of integers in the 'z' value.  

The purpose of the 'while' loop is to continue iterating through the sequence until the number of integers in the final value is 1000, at which point the loop will exit.  During each iteration of the loop, the first step is to add the 'x' and 'y' values which represent the current and previous numbers in the sequence and set 'z' equal to that sum, creating the next number in the sequence.  At this point one is added to the 'n' or index value since the current value of numbers in the sequence has gone up by one.  Next, the 'i' value is set equal to the number of integers in the 'z' value.  This is done by converting the 'z' into a string and counting the length.  Next, 'x' is set equal to 'y' and 'y' is set equal to 'z' so that the sequence can be advanced.  Once the 'i' value reaches 1000 the loop is exited and the code prints the current 'n', 'z', and 'i' values.  The 'n' value is the important one here as it represents the amount of numbers in the sequence.  

In the end, the result was that n = 4782 and when input into the Euler Project page it was marked as correct.  So there were two main takeaways from this problem, although both were fairly basic.  The first was a closer look at the Fibonacci sequence, as well as fairly quick code to calculate the sequence.  The second was setting up a 'while' loop to exit at the correct point, but this is also nothing new.   With the first problem down, it is now time to select the next ones I will tackle.  One option that I may consider is skipping ahead a ways to find some more challenging problems, but I will discuss this more in the next installment.  


Project Euler Archives

Comments

Popular posts from this blog

Credit Card Fraud Detection

Movie (Data) Magic: Reviewing the Critics