"noindex" for nearly all pages

Posted: Tue Jul 31, 2018 1:50 am
by stephanbrunker
This is the problem from the German subforum: viewtopic.php?f=10&p=10450914&sid=a202a ... #p10450914, and I was able to hack around it. Because I don't know whether what I've done is right, and I think this is a bug that needs fixing, I am reposting it here.

Basically, the pages in my blog weren't indexed by Google because of the "noindex" robots meta tag in the header; only the static pages are indexed. I traced the problem through the source, and one piece of it is these lines in index.tpl (I use the 2k11 theme):

Code: Select all

{if ($view == "entry" || $view == "start" || $view == "feed" || $view == "plugin" || $staticpage_pagetitle != "" || $robots_index == 'index')}
    <meta name="robots" content="index,follow">
{else}
    <meta name="robots" content="noindex,follow">
{/if}
That means that either $view is not one of the values that allow indexing, or $view is not defined at all. The latter could be true, because in the transfer from the $serendipity[] array to $serendipity['smarty']->assign (functions_smarty.inc.php, line 1062) the parameter 'view' is missing (the listing below already shows the 'view' line I added):

Code: Select all

function serendipity_smarty_init($vars = array()) {
	...
        $serendipity['smarty']->assign(
            array(
                'head_charset'              => LANG_CHARSET,
                'head_version'              => $serendipity['version'],
                'head_title'                => $serendipity['head_title'],
                'head_subtitle'             => $serendipity['head_subtitle'],
                'head_link_stylesheet'      => $serendipity['smarty_vars']['head_link_stylesheet'],
                'head_link_script'          => $serendipity['smarty_vars']['head_link_script'],
                'head_link_stylesheet_frontend' => $serendipity['smarty_vars']['head_link_stylesheet_frontend'],

                'is_xhtml'                  => true,
                'use_popups'                => $serendipity['enablePopup'],
                'use_backendpopups'         => $serendipity['enableBackendPopup'],
                'force_backendpopups'       => $force_backendpopups,
                'is_embedded'               => (!$serendipity['embed'] || $serendipity['embed'] === 'false' || $serendipity['embed'] === false) ? false : true,
                'is_raw_mode'               => $serendipity['smarty_raw_mode'],
                'is_logged_in'              => serendipity_userLoggedIn(),

                'entry_id'                  => (isset($serendipity['GET']['id']) && is_numeric($serendipity['GET']['id'])) ? $serendipity['GET']['id'] : false,
                'is_single_entry'           => (isset($serendipity['GET']['id']) && is_numeric($serendipity['GET']['id'])),

                'blogTitle'                 => $serendipity['blogTitle'],
                'blogSubTitle'              => (!empty($serendipity['blogSubTitle']) ? $serendipity['blogSubTitle'] : ''),
                'blogDescription'           => $serendipity['blogDescription'],

                'serendipityHTTPPath'       => $serendipity['serendipityHTTPPath'],
                'serendipityDefaultBaseURL' => $serendipity['defaultBaseURL'],
                'serendipityBaseURL'        => $serendipity['baseURL'],
                'serendipityRewritePrefix'  => $serendipity['rewrite'] == 'none' ? $serendipity['indexFile'] . '?/' : '',
                'serendipityIndexFile'      => $serendipity['indexFile'],
                'serendipityVersion'        => ($serendipity['expose_s9y'] ? $serendipity['version'] : ''),

                'view'                      => $serendipity['view'],
                'lang'                      => $serendipity['lang'],
                'category'                  => $category,
                'category_info'             => $category_info,
                'template'                  => $serendipity['template'],
                'template_backend'          => $serendipity['template_backend'],
                'wysiwygToolbar'            => $serendipity['wysiwygToolbar'],
                'wysiwyg_customPlugin'      => $wysiwyg_customPlugin,
                'wysiwyg_customConfig'      => $wysiwyg_customConfig,
                'use_autosave'              => (serendipity_db_bool($serendipity['use_autosave']) ? 'true' : 'false'),

                'dateRange'                 => (!empty($serendipity['range']) ? $serendipity['range'] : array())
            )
        );


After I inserted the line

Code: Select all

                'view'                      => $serendipity['view'],
the start page changed to 'index,follow', but the entries linked on the start page did not. Because one of the values for 'view' is 'archives', which looked promising, I tried adding '|| $view == "archives" ' in index.tpl (2k11), line 12:

Code: Select all

{if ($view == "entry" || $view == "start" || $view == "archives" || $view == "feed" || $view == "plugin" || $staticpage_pagetitle != "" || $robots_index == 'index')}
    <meta name="robots" content="index,follow">
{else}
    <meta name="robots" content="noindex,follow">
{/if}
and then all the full entries changed to 'index,follow' too. Now I can request a new crawl from Google, and hopefully all 132 skipped pages will be indexed.
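
To make the template condition easier to reason about, here is a minimal sketch of the same decision in plain PHP. The function name robots_content() and its parameters are hypothetical and exist only for illustration; only the view names are taken from the index.tpl snippets above. It shows why an undefined $view falls through to "noindex,follow".

```php
<?php
// Sketch of the index.tpl condition as plain PHP.
// robots_content() and its parameters are hypothetical; only the view
// names come from the template shown above.
function robots_content(
    ?string $view,
    string $staticpageTitle = '',
    string $robotsIndex = '',
    bool $allowArchives = false
): string {
    $indexedViews = ['entry', 'start', 'feed', 'plugin'];
    if ($allowArchives) {
        // The extra value added in the edited template.
        $indexedViews[] = 'archives';
    }

    $index = in_array($view, $indexedViews, true)
        || $staticpageTitle !== ''
        || $robotsIndex === 'index';

    return $index ? 'index,follow' : 'noindex,follow';
}

echo robots_content('entry'), "\n";                // index,follow
echo robots_content(null), "\n";                   // noindex,follow ($view never assigned)
echo robots_content('archives'), "\n";             // noindex,follow (original template)
echo robots_content('archives', '', '', true), "\n"; // index,follow (edited template)
```

An unassigned $view behaves like the null case here: none of the comparisons match, so the {else} branch wins.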

I don't know what the origin of this bug is. It simply cannot be true that all pages powered by s9y have never been indexed by Google, but I think it is important to fix this as soon as possible.

As for plugins, I only use serendipity_event_multilingual, and I don't think that one is to blame ...

Re: "noindex" for nearly all pages

Posted: Tue Jul 31, 2018 9:54 am
by onli
Thanks for the report, but we have to look a bit further into this. This function definitely works in my blog (https://www.onli-blogging.de/), and also in some other s9y blogs I know (https://yellowled.de/ for example).

It is correct that the view variable is not set in functions_smarty.inc.php, but that can also just mean it is set somewhere else. It is not necessarily set as $view, it can also be part of an array or directly given to smarty... Got it, it is set in genpage.inc.php, see https://github.com/s9y/Serendipity/blob ... nc.php#L17:

Code: Select all

$uri_addData = array(
    'startpage' => false,
    'uriargs'   => implode('/', serendipity_getUriArguments($uri, true)),
    'view'      => $serendipity['view'],
    'viewtype'  => isset($serendipity['viewtype']) ? $serendipity['viewtype'] : ''
);
This is then given to smarty in https://github.com/s9y/Serendipity/blob ... .php#L1108.

There are some things that could go wrong here. What is actually given to smarty_init is $serendipity['plugindata']['smartyvars'], and the genpage hook event is executed before that call, so you could have a plugin installed that manipulates that variable. It is also possible that smarty_init aborts before setting the variable; one reason would be that, in your case, smarty is already initialized. That could also happen because of a plugin listening to genpage and helpfully initializing smarty for you.

So, this is not a clear-cut bug in the core. But we should improve the reliability of this by at least making sure that $serendipity['plugindata']['smartyvars'] is always given to smarty, even if smarty is already initialized.

Could you test that for me? Remove your addition to smarty_init for now. Then, in genpage.inc.php, replace line 27

Code: Select all

serendipity_smarty_init($serendipity['plugindata']['smartyvars']);
with

Code: Select all

serendipity_smarty_init();
if (count($serendipity['plugindata']['smartyvars']) > 0) {
    $serendipity['smarty']->assign($serendipity['plugindata']['smartyvars']);
}
Now that should work, and if not we should be able to pinpoint it to a plugin listening to the genpage event.
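
To illustrate the theory, here is a minimal sketch that models it with stand-ins. FakeSmarty and fake_smarty_init() are hypothetical; they are not the real Smarty class or the real serendipity_smarty_init(), and they only assume that the init function does nothing when smarty already exists.

```php
<?php
// Stand-in for the Smarty object; only assign() is modelled.
class FakeSmarty
{
    public array $vars = [];

    public function assign(array $vars): void
    {
        $this->vars = array_merge($this->vars, $vars);
    }
}

// Hypothetical reduction of serendipity_smarty_init(): it returns early when
// smarty already exists, so $vars passed on a later call are silently dropped.
function fake_smarty_init(array &$serendipity, array $vars = []): void
{
    if (isset($serendipity['smarty'])) {
        return;
    }
    $serendipity['smarty'] = new FakeSmarty();
    $serendipity['smarty']->assign($vars);
}

$smartyvars = ['view' => 'entry'];

// Original call: a plugin has already initialised smarty during genpage.
$old = [];
fake_smarty_init($old);              // plugin, inside the genpage hook
fake_smarty_init($old, $smartyvars); // genpage.inc.php, original line
$oldHasView = isset($old['smarty']->vars['view']);

// Patched call: init without arguments, then assign explicitly.
$new = [];
fake_smarty_init($new);              // plugin, inside the genpage hook
fake_smarty_init($new);              // genpage.inc.php, patched
if (count($smartyvars) > 0) {
    $new['smarty']->assign($smartyvars);
}
$newHasView = isset($new['smarty']->vars['view']);

var_dump($oldHasView); // bool(false)
var_dump($newHasView); // bool(true)
```

Under that assumption, the explicit assign() is what makes the result independent of who initialised smarty first.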

Re: "noindex" for nearly all pages

Posted: Tue Jul 31, 2018 8:28 pm
by stephanbrunker
Hello Onli,

your solution does indeed work. I was just poking around without understanding what I did. The question is, if it still matters, which plugin caused this.

As Sidebar plugins I use:
serendipity_plugin_staticpage
serendipity_plugin_multilingual
serendipity_plugin_categories
serendipity_plugin_recententries
serendipity_plugin_syndication
serendipity_plugin_superuser

and event plugins:
serendipity_event_spartacus
serendipity_event_s9ymarkup
serendipity_event_emoticate
serendipity_event_nl2br
serendipity_event_spamblock
serendipity_event_customarchive
serendipity_event_staticpage
serendipity_event_multilingual
serendipity_event_statistics
serendipity_event_lightbox
serendipity_event_entrypaging
serendipity_event_entryproperties

Re: "noindex" for nearly all pages

Posted: Tue Jul 31, 2018 8:34 pm
by onli
Grep for 'genpage', that's the name of the hook. But actually many plugins catch that hook. customarchives is a candidate: whatever happens, it will call smarty_init, and if my theory is right, that's enough to trigger this bug. But I'd need to debug it more to be certain. For now I'd just push the patch you tested, thanks for that!
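
If grep is not available on the server, a small PHP script can do the same search. This is only a sketch: find_hook_listeners() is a hypothetical helper, and the plugin directory has to be passed in, because its location depends on the installation.

```php
<?php
// Return the .php files under $pluginDir whose source mentions $hook.
// find_hook_listeners() is a hypothetical helper, not part of Serendipity.
function find_hook_listeners(string $pluginDir, string $hook = 'genpage'): array
{
    $hits = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($pluginDir, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        if (!$file->isFile() || $file->getExtension() !== 'php') {
            continue;
        }
        $source = (string) file_get_contents($file->getPathname());
        if (strpos($source, $hook) !== false) {
            $hits[] = $file->getPathname();
        }
    }

    sort($hits);
    return $hits;
}

// Usage from the command line: php find_hook.php /path/to/plugins
$dir = $argv[1] ?? null;
if ($dir !== null && is_dir($dir)) {
    foreach (find_hook_listeners($dir) as $path) {
        echo $path, "\n";
    }
}
```

A hit only means the file mentions the hook name; whether the plugin actually initialises smarty there still has to be checked by hand.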

Re: "noindex" for nearly all pages

Posted: Mon Nov 04, 2019 10:24 am
by Huhu
I encountered the same problem. I just bluntly changed "noindex" to "index" to avoid any further problems; that should be harmless for the installation in question, since no comment function is active there.

The problem is that this change would certainly vanish with the next upgrade, so a patch in the 2k11 package would be much appreciated (if you can spare the time, that is).

Re: "noindex" for nearly all pages

Posted: Mon Nov 04, 2019 10:30 am
by onli
Huhu, did you really hit the bug where your entries are set to noindex?

Re: "noindex" for nearly all pages

Posted: Tue Nov 12, 2019 1:45 pm
by Huhu
Yes, all individual entry pages have been set to "noindex" in the header section. ("noindex, follow")

Re: "noindex" for nearly all pages

Posted: Tue Nov 12, 2019 2:20 pm
by Be@t
Huhu wrote:
Tue Nov 12, 2019 1:45 pm
Yes, all individual entry pages have been set to "noindex" in the header section. ("noindex, follow")
By "entry pages", do you mean a menu that is related to a category? If so, this is normal behavior and doesn't mean that your posts (within a category) will not be indexed.

PS: I like your site. Would you mind presenting it in "Showcase"?

Re: "noindex" for nearly all pages

Posted: Tue Nov 12, 2019 2:24 pm
by onli
Individual entry means the actual entry, which is really bad. It would mean that the $view variable is not set, despite the patch, which is actually already in the core. If that's the case and we can't find the core issue, maybe we need to swap it around: set index,follow by default and only set noindex on specific pages, like the archive, tags and category overviews.

Though I assume it's probably a faulty plugin overwriting it via the genpage event hook. Huhu, can you grep for that event in your installed plugins and see whether you use the same plugins as stephan?
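
As a rough sketch of the swapped-around default, assuming the intent is to keep only overview pages out of the index: the function name and the list of view names below are hypothetical placeholders, and the actual view values for archive, tag and category overviews would have to be checked against the core.

```php
<?php
// Hypothetical list of overview views to keep out of the index.
// The exact view names in a given Serendipity version are an assumption.
const NOINDEX_VIEWS = ['archives', 'categories', 'tags'];

// Index everything by default; only the listed overview views get noindex.
function robots_content_inverted(?string $view): string
{
    return in_array($view, NOINDEX_VIEWS, true)
        ? 'noindex,follow'
        : 'index,follow';
}

echo robots_content_inverted('entry'), "\n";    // index,follow
echo robots_content_inverted(null), "\n";       // index,follow (view missing)
echo robots_content_inverted('archives'), "\n"; // noindex,follow
```

With this default, a missing $view would no longer hide entries from search engines; the worst case would be an overview page getting indexed.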