Validating HTML source using Selenium and w3.org

So, you have an Ajax app and want some way to validate its HTML. How about using Selenium (to make sure the Ajax app has actually rendered) together with W3.org’s online validator?


The key bit for calling the Validator’s API is this:

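import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import org.apache.http.HttpEntity;
import org.apache.http.NameValuePair;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

public static void validateContent(String pageSource, String pageUrl) {
    List<NameValuePair> nvps = new ArrayList<NameValuePair>();
    nvps.add(new BasicNameValuePair("ss", "1"));
    nvps.add(new BasicNameValuePair("verbose", "1"));
    nvps.add(new BasicNameValuePair("uploaded_file", pageSource));

    // try-with-resources closes the client and the response even if the request fails
    try (CloseableHttpClient httpclient = HttpClients.createDefault()) {
        HttpPost httpPost = new HttpPost("http://validator.w3.org/check");
        httpPost.setEntity(new UrlEncodedFormEntity(nvps, "UTF-8"));

        try (CloseableHttpResponse response = httpclient.execute(httpPost)) {
            System.out.println(response.getStatusLine());
            HttpEntity entity = response.getEntity();
            // URL-encode the page URL so it is safe to use as a filename
            String filename = URLEncoder.encode(pageUrl + ".html", StandardCharsets.UTF_8.name());
            // Copy the raw response bytes so the report is not re-encoded with the platform charset
            Files.copy(entity.getContent(), Paths.get(filename), StandardCopyOption.REPLACE_EXISTING);
            EntityUtils.consume(entity);
            System.out.println(String.format("Wrote validation results to '%s'", filename));
        }
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}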

This takes the HTML source (as grabbed by Selenium) and posts it to the online validator. The results are written out to an HTML file which you can open in a browser.
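
To see what the results file ends up being called, here is a small sketch that isolates the filename step from the method above. The class name `ReportFilenames`, the method name `forPage` and the example URL are made up for illustration; only the `URLEncoder.encode` call comes from the original code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative helper (not part of the original example): builds the report
// filename the same way validateContent does.
final class ReportFilenames {

    static String forPage(String pageUrl) {
        try {
            // Characters such as ':' '/' and '#' are percent-encoded, so the
            // whole URL collapses into a single flat filename.
            return URLEncoder.encode(pageUrl + ".html", StandardCharsets.UTF_8.name());
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

For example, `ReportFilenames.forPage("http://example.com/app#/home")` gives `http%3A%2F%2Fexample.com%2Fapp%23%2Fhome.html`, which lands in the working directory rather than in nested folders.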

Play nice and don’t hammer their API if you intend to use their Validator: they recommend sleeping for a second between hits, or alternatively installing your own copy.
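
One way to space out the hits is a tiny throttle that blocks until enough time has passed since the previous call. This is only a sketch: the `ValidatorThrottle` class and its `await` method are hypothetical names, not something from the validator or from the GitHub example, and the interval is whatever you pass in (1000 ms would match the one-second advice).

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: makes sure successive calls to await() return at
// least minIntervalMillis apart.
final class ValidatorThrottle {
    private final long minIntervalNanos;
    private long lastCallNanos;
    private boolean called = false;

    ValidatorThrottle(long minIntervalMillis) {
        this.minIntervalNanos = TimeUnit.MILLISECONDS.toNanos(minIntervalMillis);
    }

    // The first call returns immediately; later calls sleep for whatever is
    // left of the interval since the previous call returned.
    synchronized void await() {
        try {
            if (called) {
                long remaining = minIntervalNanos - (System.nanoTime() - lastCallNanos);
                while (remaining > 0) {
                    TimeUnit.NANOSECONDS.sleep(remaining);
                    remaining = minIntervalNanos - (System.nanoTime() - lastCallNanos);
                }
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag and give up rather than hit the API early.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting between validator calls", e);
        }
        called = true;
        lastCallNanos = System.nanoTime();
    }
}
```

Create one `new ValidatorThrottle(1000)` for the whole test run and call `await()` just before each `validateContent(...)` call. A plain `Thread.sleep(1000)` after each call would do as well; the throttle just avoids sleeping when the pages themselves already take a while to render.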

A working example can be grabbed from GitHub.

