
Parsing content for links

 
Tony


(Msg. 1) Posted: Wed Feb 21, 2007 12:20 pm
Post subject: Parsing content for links
Archived from groups: comp.lang.php

I have a content management system that has links within the content
field in the database and I need to verify if those links are correct.
What I need to have happen is have a php script query the database and
then parse through the content field to find all the <a href> tags to
get the href attribute value and the link text.

Does anyone have a way of doing this or a regex to do this?

Thanks,
Tony

Arjen


(Msg. 2) Posted: Thu Feb 22, 2007 9:24 am
Post subject: Re: Parsing content for links

Tony wrote:
> I have a content management system that has links within the content
> field in the database and I need to verify if those links are correct.
> What I need to have happen is have a php script query the database and
> then parse through the content field to find all the <a href> tags to
> get the href attribute value and the link text.
>
> Does anyone have a way of doing this or a regex to do this?
>

// $matches[1]: href values, $matches[2]: link texts
preg_match_all ("/<a[\s]+[^>]*?href[\s]?=[\s\"\']+".
"(.*?)[\"\']+.*?>"."([^<]+|.*?)?<\/a>/is",
$html, $matches);
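
A minimal sketch of how the results of such a preg_match_all call might be read. The sample $html string, the $pattern and $links names, and the simplified pattern itself are assumptions for illustration, not the exact expression above.

```php
<?php
// Hypothetical sample content; in practice this would come from the database.
$html = '<p>See <a href="http://www.example.com/">Example</a> and '
      . "<a class=\"ext\" href='/about.php'>About us</a>.</p>";

// Simplified pattern (an assumption): group 1 = href value, group 2 = link text.
$pattern = '/<a\s[^>]*?href\s*=\s*["\']([^"\']*)["\'][^>]*>(.*?)<\/a>/is';

$links = array();
if (preg_match_all($pattern, $html, $matches)) {
    // Default PREG_PATTERN_ORDER: $matches[1] holds all hrefs,
    // $matches[2] holds all link texts, index-aligned.
    foreach ($matches[1] as $i => $href) {
        $links[] = array(
            'href' => $href,
            'text' => strip_tags($matches[2][$i]),
        );
    }
}

foreach ($links as $link) {
    echo $link['href'], ' => ', $link['text'], "\n";
}
```

Each entry in $links pairs one href with its link text, so the list can be fed to whatever check is used to verify the links.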


--
Arjen
http://www.hondenpage.com - My site about dogs

Curtis


(Msg. 3) Posted: Thu Feb 22, 2007 9:30 am
Post subject: Re: Parsing content for links

Tony wrote:
> I have a content management system that has links within the content
> field in the database and I need to verify if those links are correct.
> What I need to have happen is have a php script query the database and
> then parse through the content field to find all the <a href> tags to
> get the href attribute value and the link text.
>
> Does anyone have a way of doing this or a regex to do this?
>
> Thanks,
> Tony
>

Yeah, regex would be easiest, and there should be plenty out there,
but I might do something like this:

$re = '%
<a[^<>]+ # href may or may not come first
href=([\'"]) # capture single/double quote

# match a valid URI
(
[\w.-]+:(?://)? # scheme
[^?"]+ # authority

# possible query string and fragment
(?:
\\? [^#]+
(?: \\# [^"]+ )?
)?
)

\1 # captured quote from above
[^<>]* # possible remaining attributes
>( .*? ) # allow for nested tags
</a> # closing <a> tag
%xi';

The match for the URI would be in $match[2] and the text for the <a>
tag is in $match[3].

Just use this $re var in the preg_* functions.
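
For instance, a rough sketch of using it with preg_match_all might look like the following. The extractLinks helper and the sample $content are hypothetical, and the scheme line is written as "[\w.-]+:(?://)?", which is an assumed reading of the pattern.

```php
<?php
// The pattern from the post; the scheme line is an assumed reading.
$re = '%
    <a[^<>]+            # href may or may not come first
    href=([\'"])        # group 1: single/double quote
    (                   # group 2: the URI
        [\w.-]+:(?://)? # scheme
        [^?"]+          # authority
        (?:             # possible query string and fragment
            \\? [^#]+
            (?: \\# [^"]+ )?
        )?
    )
    \1                  # captured quote from above
    [^<>]*              # possible remaining attributes
    >( .*? )            # group 3: link text, may contain nested tags
    </a>                # closing <a> tag
%xi';

// Hypothetical helper: returns a list of array('href' => ..., 'text' => ...).
function extractLinks($content, $re)
{
    $links = array();
    // PREG_SET_ORDER gives one $match array per link found.
    preg_match_all($re, $content, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        $links[] = array(
            'href' => $match[2],
            'text' => trim(strip_tags($match[3])),
        );
    }
    return $links;
}

// Hypothetical content value, standing in for a row from the database.
$content = '<p><a href="http://www.example.com/page.php?id=3">Page <b>three</b></a></p>';
$links = extractLinks($content, $re);
print_r($links);
```

As written in this sketch, the scheme part is required, so relative links such as /about.php would not be matched; wrapping the scheme line in an optional group would be one way to allow them.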

Hope this helps,
Curtis

All times are: Pacific Time (US & Canada)