Category: Personal projects

Making a cloud Native webcrawler in Go

Post author By Admin
Post date May 12, 2024
No Comments on Making a cloud Native webcrawler in Go

Image of nodes connected in a graph. Showing how pages related to each other

Over the past few weeks I have been making a webcralwer. I wanted to do it as way to get better at Go with useful for learnings for Graph databases as well as being fun. The project made use of cloud native items such as AWS SQS, DynamoDB and optionally Neptune which could be swapped out for Neo4j.

What is a webcrawler?

A webcrawler or webspider is a program which visits a website, and fetches all of the links on that site and then visits them. This is how sites like Google/Bing/DuckDuckGo get the pages to populate when searching.

Architecture

Image of AWS services. SQS is drawn connected to the Webcrawler as well. There is also a connection to Neptune and AWS DynamoDB

SQS queue

The primary role of the SQS queue was to allow as a store of links to explore. It also if there is an issue processing the link we have a dead letter queue automatically which allows for a high level observability.
It also allows for the program to better scale horizontally. By Allowing multiple nodes to pick up the work.

DynamoDB

This acted as the long term storage of the data. Initially I tried modelling the links between sites inside DynamoDB however this did not work well with the access patterns due to 1MB limit on items in DynamoDB. DynamoDB works excellently as a NoSQL serverless solution for Key value look ups

Neptune/Neo4j

I also used a graph database to store the relationships between pages. This proved much easier than expected. Due to openCypher being a really beginner friendly tool very similar to SQL but allowing for very complex relationships to be models easily. I will be using graph databases in the future for other projects as well.

Take aways

For me it was a fun project which I learnt a lot about some of the major difficulties which Search engines encounter. Such as pages not being formatted correctly, or pages linking to subpages endlessly. In the future I want to build a searching mechanism on top of this to act as a crude full search engine.

For anyone curious here is link to the Github

Tags AWS, go

Misc Personal projects Tech

Deployment options for Rails

Post author By Admin
Post date November 28, 2019
No Comments on Deployment options for Rails

Over the past year, I have needed to deploy Ruby on Rails applications to the cloud, and I have tried a few methods.

Capistrano
This cool system and is fairly interesting, but not the most modern technique it creates files called cap files which then can be used to deploy the app over SSH. However the main issue being scaling, since it only deploys to one location it means auto scaling becomes difficult.
Heroku
Heroku is a PaaS system, meaning you don’t have to deal with server configuration, the application runs inside a specially configured docker container. In terms of ease of use Heroku scores very highly, Heroku makes it very easy to deploy an Application to production. However it being a platform you pay a premium for the ease of use.
Docker
This methods seems the most hands on but it grants the most options, and probably the most modern.
My recommendation is to create a deployment docker file, which can build the file from a Git repository. Then its your choice if you use something like kubernetes, or Docker Swarm to create other containers with the app running.
Dokku
Dokku is an interesting project. The program uses docker containers in the backend and configures them. So in a way you can host your own PaaS system, it requires some configuration, however It can lead to a very effective PaaS System.

In the end, the type of deployment you do depends on the requirements of the project.

Code News Personal projects

New Code blog

Collie.codes is the new blog for Projects I am working on.

I have been keeping this on the back-burner for a while now, but I have finally gone live with a blog to keep people up to date with all of the projects which I am currently working on and all of the projects I have finished. In addition I made a Scrum board for my self, which works for each project listed on the website .

Which is very handy for keeping up to date with projects and seeing what needs to be done at a moments notice, I took inspiration from the TV show Silicon Valley, for the layout of this particular scrum board. It also in the backend as the ability to add keys to highlight particular tasks. EG Networking related things might be Green, but I didn’t end up adding to the final build since I am the only one using it so the tasks are pretty clear to me what are they fall under.

Code Misc Personal projects

Dev-ops on an iPad

The surprising power of an iPad

Recently I have been testing my skills/patience by trying to make and develop websites using an iPad (Pro 10.5 with smart Keyboard).

SSH

SSH initially seemed like a big hurdle since many web based SSH Clients are Java based Applets. However using an IOS app called Coda which comes with a SFTP/FTP client and SSH this was no longer and problem, however there are is probably a limitation with the usage of PEM, however I have limited usage with PEM, so your mileage may vary.

MYSQL

This is probably the weakest part of the whole experience, there are many apps however which can help with database management, in addition using phpMyAdmin might be the easiest and cheapest for most people.

Web development

A major problem which I found is that the iPad cannot run any server-side code (php, Ruby, Node.JS). So It was limited to just HTML and JS. However the workaround is to use a VPS, to use as a development server, and run and modify the code remotely, this is not ideal however at a pinch it is a viable alternative, especially if you create a disk image of a pre configured server, then all you have to do is start a VPS with the disk image and transfer the file you need to work on.

In iPad review

Though it is possible to deploy and run websites just from an iPad, it can be very frustrating especially when moving files around, however IOS 11 has helped the file experience on iOS. The experience is still sub par compared to a desktop. However it is nice to know it is possible, however I would only consider doing this with either an external keyboard or a smart keyboard since using the onscreen keyboard seriously limits usable space.

Tags Code, ipad, vps

Code Personal projects

Why I moved my hosting from shared to VPS?

Post author By Admin
Post date February 15, 2017

My situation

I was a tenant on a larger server. I was suffering a lot of outages which for my US readers was an issue. In addition the package I was on, limited my access to the server, so I could not add php code(Server side programming language) which limited the usability of the domain.

The VPS hosting provider

I had a few choices before I made the move do I do a windows environment, or Linux. I was on a windows provider and I did not see a reason to justify the cost of windows server license. Also I did not really see any benefits vs Ubuntu server LTS(which is what I am using now). The VPS host, I had used both Linode and DigitalOcean before I was more impressed by both. I chose DigitalOcean since I preferred its choice in servers, and configurations. Both offer SSD servers at reasonable prices. They can also allow me to easily add more servers if I need them in the future.

Worth it?

Hell yes!, I have full control over the server, so if I want to turn one into a game server at any time I can. I can easily run all the applications I want(Email murmur). It makes me much more flexible.

Is it for you?

Hard to say. If you are familiar with Linux and servers then probably, if not you will get very confused since the learning curve for server is sharp. I would recommend getting a raspberry pi loading Ubuntu server and seeing if you can run WordPress on it before you start paying for a VPS. In addition hold onto the old hosting provider for emails since setting up emails on a server can be a massive issue. Since there are many pitfalls like ip blocking which you can run into.

In the end it all down to personal preference.

Tags Linux, vps, why, Windows

Code Help Personal projects

Common mistakes and solutions when setting up WordPress

Post author By Admin
Post date February 15, 2017

The move from, shared hosting to personal VPS and the issues associated.

I moved the blog.alexcollie.com over to a personal VPS, which allowed for a greater deal of customisation.

However I was plagued with issues. Like the php code not being able to modify the WordPress files. All of these challenges are simply solved, but do take a lot of googling.

Common mistakes I made

This error with trying to connect Jetpack

jetpack Please contact your hosting provider to enable PHP’s XML extension.

It seemed like no one else had suffered the same issue.

The solution was really easy just do.(php 7 code)

apt install php-xml

or depending on apt version

apt-get install php-xml

Every time you try and install a or plugin or upload an image or post it said

To perform the requested action, WordPress needs to access your web server. Please enter your FTP credentials to proceed. If you do not remember your credentials, you should contact your web host.

This was a permissions issue were the php code did not have the rights to modify any of the files. Again it a really easy fix

chown -Rf www-data.www-data /var/www/html/

Just change ‘html’ to where the WordPress file is located.

I can think this page for this solution

When I tried to crop img inside WordPress it could run into errors.

sudo apt-get install php-gd

sudo apt install php-gd

This adds a library which allows php to crop images.

Despite all of these errors. Linux/GNU is amazing.

Tags contact, enable, Errors, extension, FTP, Help, hosting, jetpack, jetpack Please contact your hosting provider to enable PHP's XML extension, Lunix, php, PHP's, Please, provider, to, Wordpress, xml, your

Personal projects Tech

My experience building a 3d Printer

Post author By Admin
Post date August 14, 2016

Times are changing

So when I started building my 3d printer, I had an idea of what 3d printers were, and how they could be useful. However, after finishing it I have come to the realisation that 3d printers are so much more, there is nothing like the feeling of power, of seeing an object from a film or TV show and then just deciding lets build that object. Then, 5 hours later you have it in your hands. Calling it a 3d printer for me is too simplistic, since printers to me just creates objects to be viewed but a 3d printers on the other hand can create objects that can be interacted with. I prefer the term 3d fabricator, which has the added bonus of sounding like it is straight from a sci-fi film.

Building a 3d printer was remarkably easy for me at least there was no direct soldering, but a lot of bug testing when I thought I had finished the build.

I can 100% see a more refined version in many peoples houses in a few years. A 3d printer is such a practical object with so many applications. I found out the other day I that the international space station has now installed a 3d printer, so they don’t need to wait for supplies to install new hardware.

However,

As it currently stands 3d printers have a few limitations. They are big and noisy; they often require a lot of technical knowledge. They also, have a few material limitations plastics are currently the easiest the the most commonly available specifically, PLA and ABS. However, for wood and other materials it will might require a more specialised extruder head which can add the the already high costs of a 3d printer.

All things considered, it was worth me it was interesting peak inside the 3d printing world, and I hope to see lots more development in the future.

Here is a video of my 3d printer building the Gherkin.

Tags 3d, cheap, make, peronal, printer