TL;DR What you're asking would basically require a rewrite of the Sitemap plugin.
To add <lastmod> tags to a sitemap you would first need to update the function that creates the XML snippets to take a timestamp argument and add it to the XML output:
private function sitemap_output($request, $priority, $modified = NULL) {
echo "\t<url>\n" .
"\t\t<loc>" . qa_xml(qa_path($request, null, qa_opt('site_url'))) . "</loc>\n" .
(is_null($modified) ? '' : "\t\t<lastmod>" . $modified . "</lastmod>\n") .
"\t\t<priority>" . max(0, min(1.0, $priority)) . "</priority>\n" .
"\t</url>\n";
}
You'd also have to actually pass that information to the function, meaning you must adjust the database queries to fetch it from the database. The information shown in a sitemap can come from four database tables:
- questions
- users
- tags
- categories
The tables for tags and categories don't contain timestamps, so adding a <lastmod> tag for those is not feasible.
For users and questions the simplest approach would look like this
SELECT userid, handle, created FROM ^users WHERE ...;
and this
SELECT postid, title, hotness, created FROM ^posts WHERE ...;
respectively. However, that would not account for edited posts or user profiles, so you'd rather use the COALESCE() function to get the timestamp of the last update and fall back to the creation timestamp if that doesn't exist:
SELECT userid, handle, COALESCE(written, created) FROM ^users WHERE ...;
and
SELECT postid, title, hotness, COALESCE(updated, created) AS lastmod FROM ^posts WHERE ...;
But you also want the timestamp in a proper format in the XML, so you'd use something like
DATE_FORMAT(CONVERT_TZ(COALESCE(...), @@session.time_zone, '+00:00'), '%Y-%m-%dT%T+00:00')
instead of just COALESCE(...).
Then adjust the calls to sitemap_output() for questions and users, e.g.
$this->sitemap_output(..., ..., $question['lastmod']);
If you want newer pages first you need to reverse the processing order. The Sitemap plugin fetches database records in chunks of 100 and ordered by ID. Since the ID is a simple auto-incremented integer lower ID means older entry. You could change the sort order from ascending to descending to get highest ID first instead of lowest ID first, but beware that you must also adjust the condition from >= to <= for that to work:
SELECT userid, handle, DATE_FORMAT(...) AS lastmod FROM ^users WHERE userid<=# ORDER BY userid DESC LIMIT 100;
And you must adjust the calculation of the next ID in the PHP code to decrement as well.
However, that simple approach won't suffice, since it doesn't account for edits (for which the timestamp is stored in a different column, see above). You can't order by the result of a COALESCE() call, though, so you need something like an auto-generated row index to use instead of the ID:
SET @row := 0;
SELECT * FROM (SELECT (@row := @row + 1) as row, handle, lastmod FROM (SELECT userid, handle FROM ^users) t1 ORDER BY lastmod DESC) t2 WHERE row>=# LIMIT 100;
which is pretty ugly due to the nested sub-selects. I'm also not sure about how well it would perform for tables with many rows. Arguably a better approach might be to define views for the users and posts tables, fetch all data with a single query per table, and then sort the data in PHP.
As you can see, this is getting pretty complex pretty fast. And I'm not even touching splitting the sitemap here, which AFAICS would require creating different responses depending on the requested file.