
Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

    
Adding a string in the middle of a page
substring, word count, paragraph count
stipko



 
Msg#: 4584007 posted 11:47 pm on Jun 13, 2013 (gmt 0)

Hi all,

I want to add a $stringA into $stringB at the paragraph count mid-point.

In other words, $stringB has four paragraphs, delineated by eight <p> marks. After <p> mark number two, I want to insert $stringA, then continue with the rest of $stringB.

EXAMPLE:

$stringA = " ...<a href=aaa.html>Eat at joes</a>... ";

$stringB = "This is a paragraph of longer text. <p> There is more. <p> It continues. <p> And it ends <p>.";

$output = "This is a paragraph of longer text. <p> There is more. <p> ...<a href=aaa.html>Eat at joes</a>... It continues. <p> And it ends <p>.";


Basically, I am looking for a way to insert $stringA in the middle of every page in my CMS.

Thanks for any help given!

 

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4584007 posted 1:04 am on Jun 14, 2013 (gmt 0)

You've really got two different questions. One is how to locate the insertion point; this involves a Regular Expression of greater or lesser trickiness depending on what else is happening in the string. The other is the mechanics of text insertion.
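For the second question, the mechanics of insertion, here is a minimal sketch that assumes the byte offset of the insertion point is already known. PHP's substr_replace with a length of 0 inserts without overwriting anything. The sample strings, and the strpos call that stands in for "locating the insertion point", are illustration only.

```php
<?php
// Illustrative values only; $stringB stands in for the CMS page body.
$stringA = '[INSERTED]';
$stringB = 'First half. Second half.';

// Assumed for this sketch: the insertion offset is already known.
// Here it is simply the position of the word "Second".
$offset = strpos($stringB, 'Second');

// A length of 0 tells substr_replace to insert rather than overwrite.
$output = substr_replace($stringB, $stringA, $offset, 0);

echo $output, "\n"; // First half. [INSERTED]Second half.
```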

You don't say how the page is brought into being in the first place. CMS implies php involvement at some stage-- but what exactly do you see, and what does the text look like at the stage when you're adding this string?

four paragraphs, delineated by eight <p> marks

I hope that means four <p> marks and a further four </p> marks for a total of eight. Will the inserted text form a new paragraph-- call it 2b --or does it go at the beginning of the present paragraph 3?


For reasons 2c2e, I recently had occasion to insert the word "ferret" after every six words in an e-text. Interesting exercise. Next time I'll code it differently.

stipko



 
Msg#: 4584007 posted 1:35 am on Jun 14, 2013 (gmt 0)

1. Thanks for responding.

2. Yes, I mean 4 of <p> and 4 of </p> totaling 8. Bad grammar. Fiancee was yappin when I was typing. Sorry.

3. The mechanics of inserting are pretty simple, but locating the paragraph midpoint is my confusion. I shudder to think I'd have to split $stringB up, insert the text, then rejoin the strings. Since this is for a CMS, $stringB is different all the time.

Hmmmm. Thoughts?

lucy24




 
Msg#: 4584007 posted 3:08 am on Jun 14, 2013 (gmt 0)

:: thinking out loud ::

When you say "paragraph midpoint" do you mean--

The center of some particular paragraph
Between the current paragraphs 2 and 3
End of paragraph 2
Beginning of paragraph 3
Some other possibility I haven't thought of?

Assuming for the sake of discussion that you mean "Immediately after the third occurrence of <p>" and that it's simply <p> not <p class etcetera>:

If your paragraphs were pure text, the search would be easy because you'd simply construct a Regular Expression that goes something like

(<p>[^<>]+</p>\n+<p>[^<>]+</p>\n+<p>)

and then pop in your string. (OK, so the smilie generator also goes haywire when it meets > followed by a closing parenthesis.) But I have to assume there's other stuff in there-- formatting and anchors and anything else using <> characters. Now, obviously you can make a Regular Expression that goes

^(<p>.+<p>.+)(<p>.+<p>.+</p>)$

but, well, ugh, yuk.

Will your target string always contain exactly four paragraphs, no more and no less? What do you want to have happen if it's the wrong number? Or is this possibility so dreadful that your CMS simply won't build the page at all?

If your paragraphs run on from beginning to end without hard line breaks, then each paragraph can be treated as a single string in one-line-at-a-time mode. Then you're not really chopping up the string, you're going for substrings: do this stuff at the beginning of the third non-empty substring. That's how I'd do it in a text editor.
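A rough PHP rendering of that text-editor approach, assuming every paragraph sits on its own line with no hard line breaks inside it. The sample text and the [INSERTED] marker are made up for illustration.

```php
<?php
// Assumption for this sketch: one paragraph per line, possibly with blank lines between.
$stringA = '[INSERTED]';
$stringB = "<p>One.</p>\n\n<p>Two.</p>\n<p>Three.</p>\n<p>Four.</p>";

$lines = explode("\n", $stringB);
$seen = 0;
foreach ($lines as $i => $line) {
    if (trim($line) === '') {
        continue; // blank lines do not count as paragraphs
    }
    $seen++;
    if ($seen === 3) {
        // Beginning of the third non-empty line.
        $lines[$i] = $stringA . $line;
        break;
    }
}
$output = implode("\n", $lines);

echo $output, "\n";
```

This puts the insert in front of the third paragraph's opening tag; to land inside the paragraph instead, the offset would have to be moved past the <p>.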

:: riffling through docs because I can't remember how you say "IndexOf" in php ::

Eck. I can't figure out how you'd get "the third occurrence of..." without cutting off the beginning of the string and then searching again.
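One way around cutting the string, sketched here: strpos accepts an optional third argument, the offset to start searching from, so the search can be resumed just past the previous hit. The nthPos helper below is hypothetical, not a PHP built-in.

```php
<?php
/**
 * Hypothetical helper (not a PHP built-in): byte offset of the $n-th
 * occurrence of $needle in $haystack, or false if there are fewer than $n.
 */
function nthPos(string $haystack, string $needle, int $n)
{
    if ($n < 1 || $needle === '') {
        return false;
    }
    $pos = -1;
    for ($i = 0; $i < $n; $i++) {
        // Resume the search one byte past the previous hit.
        $pos = strpos($haystack, $needle, $pos + 1);
        if ($pos === false) {
            return false;
        }
    }
    return $pos;
}

$stringB = '<p>One.</p><p>Two.</p><p>Three.</p><p>Four.</p>';
$third = nthPos($stringB, '<p>', 3);

var_dump($third); // int(22)
```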

This works in my text editor. Third attempt, I am sorry to say.

(<p>(?:[^<>]*</?[a-oq-z][^>]*>)*[^<>]*</p>\n*<p>(?:[^<>]*</?[a-oq-z][^>]*>)*[^<>]*</p>\n*<p>)

... and then stop and pop in your "Dinner Break!" or whatever-it-was substring.

If there exists some new HTML 5 tag whose name also begins in "p" I do not want to hear about it. And I REALLY hope the smilies will stay away from the real post, although they insist on cropping up in Preview. The noseless ;) sequence means winks are triggered by
>)
<)
&)
et cetera.

stipko



 
Msg#: 4584007 posted 5:30 am on Jun 14, 2013 (gmt 0)

Thank you for the valiant effort. I partially understand, and you partially understand.

I suspect my quandary is less complicated than you assumed: the string will contain varying numbers of paragraphs, and there will be occasional markup in there.

I think the solution is to count the total number of <p> in the string, and then insert the insert string after (total * .5) of them.

Does that make sense?

Something along the lines of:

$stringA = " ...<a href=aaa.html>Eat at joes</a>... ";

$stringB = "This is a paragraph of longer text. <p> There is more. <p> It continues. <p> And it ends <p>.";

$totalNumber = magicalFunction($stringB); // result is "4", let's say...

// Something magical splits the string at the mid-paragraph-number point...

$newString = ($firstHalfOfString . $stringA . $secondHalfOfString);


Am I being more clear?

PS: I'm happy to pay for this help. I just need this puppy solved.

lucy24




 
Msg#: 4584007 posted 8:19 am on Jun 14, 2013 (gmt 0)

Ah, that's what you mean by "midpoint". Total number of occurrences of <p>, divide by two, next higher integer. If there are four paragraphs, you're inserting your stuff on the 4/2 + 1 = 3rd occurrence of <p>. If there are twenty paragraphs, it goes at the beginning of paragraph 11. If there are five paragraphs, do you want the third or the fourth? It's a question of whether you round to the next higher integer before or after you round to an integer at all. 4/2 + 1 = (4+1)/2, but 5/2 + 1 != (5+1)/2
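To make the odd-count question concrete, here is a small sketch of two possible readings. Both function names are hypothetical; they return the 1-based number of the <p> that would receive the insert. The readings agree for even counts and differ by one for odd counts.

```php
<?php
// Hypothetical helpers: the 1-based number of the <p> that receives the insert.

// Halve and round down first, then step to the next paragraph.
function midpointRoundDown(int $n): int
{
    return intdiv($n, 2) + 1;
}

// Halve and round up first, then step to the next paragraph.
function midpointRoundUp(int $n): int
{
    return (int) ceil($n / 2) + 1;
}

foreach ([4, 5, 20] as $n) {
    printf("%2d paragraphs: round down -> %d, round up -> %d\n",
        $n, midpointRoundDown($n), midpointRoundUp($n));
}
//  4 paragraphs: round down -> 3, round up -> 3
//  5 paragraphs: round down -> 3, round up -> 4
// 20 paragraphs: round down -> 11, round up -> 11
```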

:: further business with php dot net because I can't believe the function I'm looking for doesn't exist ::

Half of what you want is substr_count. That gives you the number of occurrences of <p>. It's the other half that's being elusive.

Viewed strictly as a regular expression:

((?:<p>(?:[^<>]*</?[a-oq-z][^>]*>)*[^<>]*</p>\n*){X}<p>)

Meaning "X occurrences of {long-and-complicated-pattern}, winding up with <p>". Get your substring, find its length, bisect the original string at that point. But that looks like three steps forward and two back.
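For what it's worth, those three steps might look like this in PHP. This is a sketch only: it assumes well-formed, non-nested <p>...</p> pairs, fills in X as half the paragraph count rounded down, and uses made-up sample strings. The repeated paragraph pattern is wrapped in a non-capturing group so that {X} applies to the whole thing.

```php
<?php
$stringA = '[INSERTED]';
$stringB = "<p>One <b>bold</b>.</p>\n<p>Two.</p>\n<p>Three.</p>\n<p>Four.</p>";

// X = number of complete paragraphs to skip before the insertion point.
$x = intdiv(substr_count($stringB, '<p>'), 2);

// One paragraph: <p>, any mix of text and non-p tags, </p>, optional newlines.
$paragraph = '<p>(?:[^<>]*</?[a-oq-z][^>]*>)*[^<>]*</p>\n*';

// "~" is the delimiter because the pattern itself contains "/".
$pattern = '~(?:' . $paragraph . '){' . $x . '}<p>~';

if (preg_match($pattern, $stringB, $m, PREG_OFFSET_CAPTURE)) {
    // End of the match = start offset + length of the matched substring.
    $cut = $m[0][1] + strlen($m[0][0]);
    // Bisect the original string at that point and put $stringA in between.
    $output = substr($stringB, 0, $cut) . $stringA . substr($stringB, $cut);
} else {
    // Markup did not fit the pattern; leave the page untouched.
    $output = $stringB;
}

echo $output, "\n";
```

With the sample above, the insert lands immediately after the third <p>.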

This is probably just as well. The last time I answered a php question, a moderator took me aside and threatened to cut off my posting buttons if I ever tried anything of the kind again :(

Dideved



 
Msg#: 4584007 posted 7:44 pm on Jun 14, 2013 (gmt 0)

@stipko If this page is being generated by your own CMS, then by far your best option would be to insert string A during your normal template rendering process.

If for whatever reason that isn't an option, and you truly have no other choice but to parse and manipulate your own HTML, then your best bet is the DOMDocument class (http://php.net/manual/en/class.domdocument.php). Even the very best regular expression will have trouble accounting for all the possible variations that can occur in HTML. The DOMDocument class, on the other hand, uses a real HTML parser. Use the loadHTML method (http://www.php.net/manual/en/domdocument.loadhtml.php), then you can use normal DOM methods, such as getElementsByTagName, getElementById, childNodes, nextSibling, appendChild, and insertBefore.
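A minimal sketch of the DOMDocument route, under a few assumptions: the page body is an HTML fragment, the wrapper <div> exists only to find the fragment again after parsing, the insert is built as an element rather than parsed from a string, and it goes at the start of the paragraph just past the halfway mark. The sample strings are made up.

```php
<?php
$stringB = '<p>One.</p><p>Two.</p><p>Three.</p><p>Four.</p>';

// Collect parser warnings quietly; real CMS markup is rarely perfect.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
// The wrapper div is an assumption of this sketch, not part of the page.
$doc->loadHTML('<div>' . $stringB . '</div>');
$wrap = $doc->getElementsByTagName('div')->item(0);

// Build the link as a DOM node instead of parsing it from a string.
$link = $doc->createElement('a', 'Eat at joes');
$link->setAttribute('href', 'aaa.html');

$paragraphs = $wrap->getElementsByTagName('p');
$half = (int) ceil($paragraphs->length / 2);
$target = $paragraphs->item($half); // null if there is no paragraph past the midpoint

if ($target !== null) {
    // Put the link at the very start of that paragraph.
    $target->insertBefore($link, $target->firstChild);
} else {
    // Fall back to appending at the end of the fragment.
    $wrap->appendChild($link);
}

// Serialize only the wrapper's children, dropping the html/body scaffolding.
$output = '';
foreach ($wrap->childNodes as $child) {
    $output .= $doc->saveHTML($child);
}

echo $output, "\n";
```

The serializer may add line breaks between block elements, so the output is not guaranteed to match the input byte for byte. Text outside plain ASCII may also need an explicit encoding hint before loadHTML.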

Dideved



 
Msg#: 4584007 posted 9:13 pm on Jun 14, 2013 (gmt 0)

Even though using a regular expression is probably the least good hacky option for this problem, for funsies of the puzzle, here's how it could work. Turns out we can make it quite simple.

$stringA = "<a href=aaa.html>Eat at joes</a>";

$stringB = "<p>This is a paragraph of longer text. <p> There is more. <p> It continues. <p> And it ends.";

// count paragraphs
$nParagraphs = substr_count($stringB, '<p>');

// find the middle (round up)
$nHalfParagraphs = ceil($nParagraphs / 2);

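// capture the first half of the paragraphs, stopping just before the next <p>;
// the "s" flag lets "." span line breaks, and the limit of 1 stops after the first match
$newStringB = preg_replace('/((?:<p>.*?){' . $nHalfParagraphs . '})(?=<p>)/s', '$1' . $stringA, $stringB, 1);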

echo $newStringB;


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved