Redesigning Speak Like A Brazilian

Apr 20, 2016 in css, laravel, speak-like-a-brazilian | blog

Three weeks ago we released a new version of Speak Like A Brazilian. Most changes are not visible to users, mainly improvements to the framework used to build it (Laravel).

But one major change that is visible to users is the new layout and CSS, powered by Semantic UI, which replaces Bootstrap.

Screenshot SLBR v2.1
Screenshot SLBR v2.2

The frameworks have some parts that overlap, but Semantic UI has more components that can be easily used. There are tons of examples, plug-ins and snippets of code for Bootstrap, but sometimes they do not match your current layout, or become deprecated with new releases.

This was the main reason for using Semantic UI instead of Bootstrap. Coincidentally, the next major change in Speak Like A Brazilian is adding semantic web tags to the site. These tags are useful for search engines, for displaying extra information in search results, but also for linked data sites to index our content.

Stay tuned!

Using Nikto to scan a website for vulnerabilities

Apr 09, 2016 in laravel, security, nikto, speak-like-a-brazilian | blog

Nikto is a web server scanner, written in Perl and licensed under GNU GPL. The source code is available at GitHub.

We released a new version of Speak Like A Brazilian last week, upgrading from Laravel 4.x to 5.x. Once we had a release candidate (RC) tag ready we deployed it and ran Nikto to check for issues in the server.

The default settings are enough for an initial test, but you can take a look at nikto —help for other interesting parameters.

nikto -host

Here are the results of the scanning.

- Nikto v2.1.4
+ Target IP:
+ Target Hostname:
+ Target Port:        80
+ Start Time:         2016-04-08 12:57:02
+ Server: nginx
+ Retrieved x-powered-by header: PHP/5.5.9-1ubuntu4.14
+ No CGI Directories found (use '-C all' to force check all possible dirs)
+ robots.txt contains 11 entries which should be manually viewed.
+ ETag header found on server, fields: 0x5704861f 0xef 
+ OSVDB-3092: /new: This may be interesting...
+ OSVDB-3092: /new/: This might be interesting...
+ OSVDB-3093: /.htaccess: Contains authorization information
+ OSVDB-3092: /de/: This might be interesting... potential country code (Germany)
+ OSVDB-3092: /it/: This might be interesting... potential country code (Italy)
+ OSVDB-3092: /jp/: This might be interesting... potential country code (Japan)
+ OSVDB-3092: /pt/: This might be interesting... potential country code (Portugal)
+ OSVDB-3092: /es/: This might be interesting... potential country code (Spain)
+ 6448 items checked: 39 error(s) and 11 item(s) reported on remote host
+ End Time:           2016-04-08 13:29:41 (1959 seconds)
+ 1 host(s) tested

There list had a few false positives, but two outstanding issues could be easily fixed. First the .htaccess file.

# File: speaklikeabrazilian.vhost.conf
# From:
# Block access to .htaccess files
location ~ /\.ht {
    deny  all;

And the second one is to hide the PHP server signature.

# File: php.ini
# From:
expose_php = Off

It is not polite to run Nikto (or most other scanners to be honest) on web sites that you do not maintain. Some sites are behind firewalls and IDS (Intrustion Detection System) systems that may block Nikto. When that happens, you can liaise with your Ops team and schedule an internal run on an application you develop or maintain.

Nikto may help you finding flaws in your server and deployment. The next step now for Speak Like A Brazilian now is running the new code through OWASP’s ZAP Proxy, which can analyse the HTTP requests for possible bugs.

Happy hacking!

Simple ETL with Talend and Docker for a Laravel website

Apr 07, 2016 in etl, talend, laravel, docker, speak-like-a-brazilian | blog

A couple of years ago when I worked at a credit bureau, I learned how important Data Management is for a company. There are several processes involved in data management, such as Record Linkage, Data Cleansing, Master Data Management among others.

One of the most common tasks in data management, common in many companies, is ETL. ETL stands for Extract, Transform, and Load. Basically, it consists in retrieving existing data, doing some transformation (remove columns, add columns, sum values, etc), and then loading the result data back in to some other database.

We just finished migrating Speak Like A Brazilian from Laravel 4.x to Laravel 5.x, and besides the necessary changes for adapting the existing code to the new version of the framework, we have also tackled other issues in the backlog, such as SSL, using a new e-mail service and a few changes in the database schema. Due to this last item, we had to do a small ETL.

Talend Open Source Data Integration

You can download the free and Open Source Talend Open Studio for Data Integration. After installing it, you can read the documentation or watch some videos with short tutorials.


But the interface can be quite intuitive, with drag and drop items, help pop up dialogue boxes, and a help available with the local tool. There are other tools such as Luigi and Pentaho Kettle that could have been used for the same task. It also supports several databases out of the box.

Talend has other free and commercial tools, that can be used for other tasks and processes in Data Management, and solutions that can also work with Big Data.

On Using Docker

Since the database is quite small (< 10 MB), using Docker seemed like a good idea to avoid installing MySQL or MariaDB locally and having to configure databases and move data around.

But before starting containers, we needed a dump of the existing database.

mysqldump -u user -p pass > slbr_20160403.sql

And now we could load the data into a new container, using the mysql:latest Docker image.

Transfer the data from the server to local computer, and start a container where we will store and massage the data.

docker run -p 4406:3306 -v $PWD/Downloads/slbr:/opt/data -v $PWD/Development/php/workspace/ --name slbr -e MYSQL_ROOT_PASSWORD=nosecret -d mysql:latest

Then we need to jump into that container to run a few commands to load the data.

docker exec -ti slbr /bin/bash

mysql -uroot -p -e "create database slbr"
mysql -uroot -p slbr < /opt/data/slbr_20160403.sql

Finally, make sure the application new database is ready as well. It is already using Docker Compose, so all it takes is start the containers with the following command:

docker-compose up

And then create the database structure with initial data.

docker exec -ti speaklikeabraziliancom_front_1 /bin/bash

php artisan migrate:install
php artisan migrate:refresh --seed

Done, now the old database is running in one container of the mysql:latest image, listening to port 4406 and the new database is running with Docker Compose, listening for connection on 3306.

Creating Talend jobs to migrate the data

We start by creating a blank project in Talend.

ETL with Talend and Docker
ETL with Talend and Docker

And then creating the database connections for the two databases.

ETL with Talend and Docker
ETL with Talend and Docker
ETL with Talend and Docker
ETL with Talend and Docker
ETL with Talend and Docker
ETL with Talend and Docker
ETL with Talend and Docker
ETL with Talend and Docker

Now it is just a matter of dragging and dropping the two connections on to the main canvas. After that, we still need one more component: a map or tMap. This allows you to map columns between databases, doing transformations like running SQL functions or Java code.

ETL with Talend and Docker
ETL with Talend and Docker
ETL with Talend and Docker
ETL with Talend and Docker

In our case, the database for Speak Like A Brazilian had a few columns removed. These columns were simply not mapped into the destination database. Running the job loaded the data, but excluding the old columns.

There was also one column added with a defaul value, for which the tMap was also used.

You can play with Talend, use row filters and other components to do different data modifications, and also use other features such as job schedule or combine with other Talend tools.

We could have used Python, Shell, or other tools for the job, but using a tool tailored for ETL is quite handy and easy. Specially in case you ever need to do that again, against different databases. All it takes is just adjust the database connections and re-run the jobs.

DMC Latam 2014 dia 2 (pt-BR)

Aug 14, 2014 in data management, events | blog

O segundo dia do DMC Latam 2014 começou com um dos dias mais frios em São Paulo. Mas dentro do Hotel o clima estava ótimo, e ainda tinha bastante café preto e bem quente. Percebi que o pessoal estava mais a vontade para trocar cartões e puxar assunto na área livre.

Sala Principal
Sala Principal

DMC Latam 2014 dia 1 (pt-BR)

Aug 13, 2014 in data management, events | blog

O DMC Latam 2014 começou nesta quinta-feira, 13/08. O evento está sendo realizado no mesmo local do ano passado, na Rua Martins Fontes, 330, no Hotel Braston. Este ano consegui chegar cedo e peguei o café da manhã.

DMC Latam Coffee Break
O Café da Manhã do DMC Latam 2014

A abertura deste ano teve um narrador, daqueles com voz de filmes, explicando a agenda do evento e fazendo os agradecimentos aos patrocinadores. Seguido pelo Rossano falando sobre realizações da DAMA desde o ano passado e sobre as palestras do dia.

O DMC Latam deste ano distribuiu um brinde na entrada para todos os participantes e também um exemplar do livro “A função do Chief Data Officer - Reorganizando os cargos executivos para alavancar o seu mais valioso ativo”.

DMC Latam Coffee Break
Sala principal