Recently I have wondered whether using InfoCards can help reduce the amount of comment spam. Granted, I am fully aware they are not meant to solve the spam problem, but I was still curious about their effectiveness. Comment spam is a constant nuisance for both bloggers and forum owners. It is so bad that typically either the ability to leave comments is turned off or user registration to the site is required. In InfoCard terms, I will be using self-issued cards (probably the most common type that will be seen for this use) as the means of authentication. Unlike managed cards where the blog or forum site would trust a third party to validate claims, the site would simply be trusting the claims made by the end user. This is really no different than current registration schemes where the user just types in their information.
In addition to the requested user information, the site generally verifies the email address provided to ensure that the submitting user controls that address. This holds true whether an InfoCard or a traditional method is used. Email verification is usually performed by the site sending an email to the submitted address with a link, containing some identifier, that the user must click on or navigate to in order to verify their address. Once this is done, the site has verified the user's registration and allows the user to log in and add comments or post to the forum.
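To make the verification step concrete, here is a minimal sketch of how a site might issue and check such a link. It is not taken from any particular blog or forum package: the function names, the in-memory $pending array, and the example.org URL are all hypothetical, and it assumes PHP 7.1 or later for random_bytes() and nullable return types.

```php
<?php
// Hypothetical sketch of email verification; names and storage are assumptions.

// Create a random, hard-to-guess identifier and remember which address it belongs to.
function create_verification_link(array &$pending, string $email, string $baseUrl): string
{
    $token = bin2hex(random_bytes(16)); // 16 random bytes as 32 hex characters
    $pending[$token] = $email;
    return $baseUrl . '?token=' . $token;
}

// Called when the link is visited: returns the verified address, or null if unknown.
function verify_token(array &$pending, string $token): ?string
{
    if (!isset($pending[$token])) {
        return null;
    }
    $email = $pending[$token];
    unset($pending[$token]); // each identifier can be used only once
    return $email;
}
```

Note that nothing in this flow checks who (or what) follows the link, which is exactly the gap discussed in the rest of this post.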
This past weekend, I wondered how easy it would be to automate this process (using PHP, of course) with InfoCards and so create comment spam. Needless to say, I found it quite easy and realized how important it is to take the human factor into account. This means that I need to make sure I am verifying the registration of a LIVE person and not some automated routine. With the traditional method of user registration (you know, where you actually have to type in all your information), it is common to have some form of captcha, making it very difficult to create an automated process that can complete a registration. Using InfoCards, there is no typing. Simply click on an image, select your card within the selector, and the selector automatically submits it.
Automating my InfoCard Spammer
The first step was to simply create a SAML token and populate it with whatever name and email address I desired. Because the email address is verified before I am allowed to log in, a valid address must be used. This is where those free temporary email services come in handy. It is possible to programmatically acquire a temporary (and checkable) email address. With the aid of the OpenSSL routines, I created a public/private key pair, made up a privatepersonalidentifier, and signed the token. I really wasn't sure how the identifier is normally generated, so to make it a bit more personal to me, I took the hash of my newly generated public key.
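The key pair and made-up identifier can be sketched along these lines. This is only an illustration of the idea, not the code I actually ran: the function name make_card_identity() is hypothetical, and the choice of SHA-256 with base64 encoding is an assumption rather than the way a real identity selector derives a privatepersonalidentifier. Building and signing the SAML token itself (which uses XML signatures) is not shown.

```php
<?php
// Sketch only: generate an RSA key pair and derive a made-up identifier from it.

function make_card_identity(): array
{
    // Generate a 2048-bit RSA key pair with the OpenSSL extension.
    $key = openssl_pkey_new([
        'private_key_bits' => 2048,
        'private_key_type' => OPENSSL_KEYTYPE_RSA,
    ]);
    if ($key === false) {
        throw new RuntimeException('Key generation failed: ' . openssl_error_string());
    }

    // Export the private key as PEM and read the public half from the key details.
    if (!openssl_pkey_export($key, $privatePem)) {
        throw new RuntimeException('Key export failed: ' . openssl_error_string());
    }
    $details = openssl_pkey_get_details($key);
    $publicPem = $details['key'];

    // Assumption: use the base64-encoded SHA-256 hash of the public key as the identifier.
    $ppid = base64_encode(hash('sha256', $publicPem, true));

    return ['private' => $privatePem, 'public' => $publicPem, 'ppid' => $ppid];
}
```

The returned private key can then be passed to openssl_sign() to sign data; the matching public key goes into the token so the receiving site can check the signature.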
The next step was to submit it to my targeted site. As this is a controlled environment, I am submitting to a specific known URL. In the real world it would not be difficult to crawl the web and find URLs where InfoCards are accepted (spammers already do this). Submitting the card requires that my generated token be encrypted with the public key of the site I am submitting to. It took a little digging, but I found a little gem I was unaware of that was added in PHP 5.1.4: the capture_peer_cert option.
Accessing x.509 Cert from HTTPS connection
In order to access the x.509 cert from an HTTPS connection, you must create a stream context and enable the ssl capture_peer_cert option. Once a connection has been made to the site via HTTPS, the certificate is then accessible from the peer_certificate option. The following demonstrates accessing the certificate for my site and getting access to the public key.
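$site_cert = NULL;
// Ask the SSL wrapper to keep the peer's certificate in the stream context
$context = stream_context_create(array('ssl'=>array('capture_peer_cert'=>TRUE)));
// The ssl:// transport needs an explicit port (443 for HTTPS)
if ($fp = stream_socket_client("ssl://www.cdatazone.org:443", $errno, $errstr, 30,
        STREAM_CLIENT_CONNECT, $context)) {
    // After the handshake, the certificate is available as a context option
    if ($options = stream_context_get_options($context)) {
        if (isset($options['ssl']) && isset($options['ssl']['peer_certificate'])) {
            $site_cert = $options['ssl']['peer_certificate'];
        }
    }
    fclose($fp);
}
if ($site_cert) {
    // Export the cert to a PEM string first, then read the public key from it
    openssl_x509_export($site_cert, $str_cert);
    $pubkey = openssl_pkey_get_public($str_cert);
}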
Due to a bug in the openssl wrapper (fixed for the upcoming 5.2.1), it is necessary to export the X.509 cert and then get the public key from the exported cert. Once 5.2.1 is released, you will be able to simply call openssl_pkey_get_public($site_cert). With the site's public key, the token can now be encrypted and submitted to the URL.
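The encryption step can be sketched as a simple hybrid scheme: the token is encrypted with a random symmetric key, and that key is wrapped with the site's public key. This is a rough illustration under stated assumptions, not the exact format a site expects. The function name encrypt_for_site() and the returned array layout are hypothetical, and the XML Encryption elements that carry these values in a real InfoCard submission are omitted. It assumes PHP 7 or later for random_bytes().

```php
<?php
// Sketch of hybrid encryption; the XML Encryption wrapper is deliberately left out.

function encrypt_for_site(string $token, $pubkey): array
{
    // Random one-time AES-256 key and IV used to encrypt the token itself.
    $sessionKey = random_bytes(32);
    $iv = random_bytes(16);
    $cipherText = openssl_encrypt($token, 'aes-256-cbc', $sessionKey, OPENSSL_RAW_DATA, $iv);
    if ($cipherText === false) {
        throw new RuntimeException('Token encryption failed: ' . openssl_error_string());
    }

    // Wrap the AES key with the site's RSA public key using OAEP padding.
    if (!openssl_public_encrypt($sessionKey, $wrappedKey, $pubkey, OPENSSL_PKCS1_OAEP_PADDING)) {
        throw new RuntimeException('Key wrapping failed: ' . openssl_error_string());
    }

    return ['key' => $wrappedKey, 'iv' => $iv, 'data' => $cipherText];
}
```

Only the holder of the site's private key can unwrap the AES key and read the token, which is why the site's certificate has to be fetched first.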
Verification and Authentication
Once the token is submitted to the site, we just need to wait for the verification email to arrive, parse it to locate any included URLs, and then programmatically navigate to them. Whether the site is looking for a GET or a POST, it is not difficult to automate the process. Considering that any type of authentication mechanism will be a plugin or add-on for a particular system, they will all generally require the same inputs, thus making automation trivial. Once verified, we are free to post comments or write in the forums (as long as we programmatically maintain the session).
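Locating the URLs in the verification email takes very little code. The following is a minimal sketch, assuming a plain-text email body; the function name extract_urls() and the regular expression are my own illustration and will not cover every URL form.

```php
<?php
// Sketch: pull http(s) links out of a plain-text email body.

function extract_urls(string $body): array
{
    // Match http:// or https:// up to whitespace or a common closing delimiter.
    preg_match_all('~https?://[^\s<>"\')]+~i', $body, $matches);

    // Trim trailing punctuation that often follows a link in prose.
    $urls = array_map(function ($url) {
        return rtrim($url, '.,;');
    }, $matches[0]);

    // Drop duplicates and reindex the result.
    return array_values(array_unique($urls));
}
```

Each returned URL could then be requested with file_get_contents() or a stream context, which is all a simple GET-based verification link needs.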
Need for Human Interaction/Verification
Based on this, I find it critical to interject some sort of human criteria check into this whole process. The user registration involves a single click, so it's not simple to add a captcha at this point. Because the selector submits the InfoCard to the site, a captcha would either need to be entered prior to initiating the selector, or registration would need to be broken into at least two steps. It is also quite possible to implement a captcha during the verification process, but again the typical flow is that a single click on the verification link is enough, since the humanness was determined during registration.
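The "challenge before the selector" option might look something like the sketch below. Everything here is hypothetical: the function names, the session keys, and especially the arithmetic question, which only stands in for a real captcha and would not stop a bot on its own. The point is the ordering: the site refuses to process a submitted card unless a challenge was solved first in the same session. It assumes PHP 7 or later for random_int().

```php
<?php
// Hypothetical sketch: require a solved challenge before a submitted card is processed.

// Step 1: show a challenge before the card selector is offered.
function issue_challenge(array &$session): string
{
    $a = random_int(1, 9);
    $b = random_int(1, 9);
    $session['challenge_answer'] = $a + $b;
    $session['human_verified'] = false;
    return "What is $a + $b?";
}

// Step 2: check the answer; each challenge allows a single attempt.
function check_challenge(array &$session, string $answer): bool
{
    $ok = isset($session['challenge_answer'])
        && ctype_digit($answer)
        && (int) $answer === $session['challenge_answer'];
    unset($session['challenge_answer']);
    $session['human_verified'] = $ok;
    return $ok;
}

// Step 3: only accept the InfoCard token if the challenge was passed.
function may_accept_card(array $session): bool
{
    return !empty($session['human_verified']);
}
```

In a real site the $session array would be $_SESSION, and may_accept_card() would be checked at the URL the selector posts the token to.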
I never really gave it much thought until recently, and I finally realized that just because an InfoCard was submitted and an email address has been verified doesn't mean there is really someone on the other end. The only good thing I can think of right now is that these are complicated technologies, which will hopefully deter or at least slow down any adoption of InfoCard spamming, at least until I change my workflows.
