amazon.sco

Use this forum for everything concerning script usage
Skunky
Newbie
Posts: 17
Joined: Sat Apr 06, 2019 3:57 pm

Re: amazon.sco

Post by Skunky »

ok, on the way. I feel like it's gotta be something simple, but *shrug*

jtclipper
Administrator
Posts: 772
Joined: Tue Aug 10, 2010 12:04 pm

Re: amazon.sco

Post by jtclipper »

The HTML file you sent me is quite different from what I am getting.
It could be your OS (I tested with both Win7 and Win10), or an IE extension stripping the style elements and inserting blanks and line breaks.
As an example, this is the artist line I get:

Code:

<DIV id=ArtistLinkSection class="a-section a-spacing-micro" sizcache="2" sizset="28"><A id=ProductInfoArtistLink class="a-size-medium a-link-normal" href="/dp/B000QKAJ14/ref=dm_ws_ps_adp">Clint Black</A> </DIV></DIV>

Your file has:

<a class="a-size-medium a-link-normal" id="ProductInfoArtistLink" href="/dp/B000QKAJ14/ref=dm_ws_ps_adp">
The URL used is

Code:

https://www.amazon.com/Out-Sane-Clint-Black/dp/B0868Z2R13/ref=sr_1_1?dchild=1&keywords=Clint+Black&qid=1605951677&s=dmusic&sr=1-1
Although it is possible to rewrite the script to handle this, it is far better to find the root cause and get your IE to produce the correct output.

edit*
Try searching from within the TGF embedded IE rather than dragging and dropping from an outside source; maybe that is the issue here.
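
If the script ever does need to cope with both outputs, the difference shown above comes down to attribute order, quoting and tag case. Here is a minimal sketch in Python (not TGF script code) of a pattern that tolerates all three. `find_artist` is a hypothetical helper name, and the inner text of the second sample is an assumption, since only the opening tag of that file is quoted above.

```python
import re

# Matches an <a ...> tag carrying id=ProductInfoArtistLink, with or without
# quotes around the id value, in any attribute order, ignoring tag case.
ARTIST_LINK = re.compile(
    r'<a\b[^>]*\bid\s*=\s*["\']?ProductInfoArtistLink["\']?[^>]*>\s*([^<]*)',
    re.IGNORECASE,
)


def find_artist(html):
    """Return the text inside the artist link, or None if no such link exists."""
    match = ARTIST_LINK.search(html)
    if match is None:
        return None
    return match.group(1).strip()


# Shape of the line the script expects (taken from the post above).
expected_by_script = (
    '<DIV id=ArtistLinkSection class="a-section a-spacing-micro">'
    '<A id=ProductInfoArtistLink class="a-size-medium a-link-normal" '
    'href="/dp/B000QKAJ14/ref=dm_ws_ps_adp">Clint Black</A> </DIV>'
)

# Shape of the line in the submitted file; the text after the tag is assumed.
seen_in_file = (
    '<a class="a-size-medium a-link-normal" id="ProductInfoArtistLink" '
    'href="/dp/B000QKAJ14/ref=dm_ws_ps_adp">\n    Clint Black\n</a>'
)

print(find_artist(expected_by_script))
print(find_artist(seen_in_file))
```

Matching on the `id` value rather than on the exact leading characters of the tag is what makes the lookup independent of how the browser serialises the page.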

Post by Skunky »

this is win10, ie 11, I'm hitting "homepage", and then typing "clint black out of sane" in the amazon search box, no drag and drop. the only plugin that was installed for IE was adblock plus, and I've removed it. it still generates a 1.htm with lots and lots of blank space.

I still have the entry in regedit to make it use the "right" version of ie, which worked before. any suggestions?

Post by jtclipper »

The only way I could reproduce the issue was by drag and drop. On two separate Win10 64-bit installations (one working install and one fresh one) it works as intended.
Something else on your installation could be causing this. If you can try it on a clean install or some other PC, that would be helpful.
You can also try removing that setting from your registry and using the default settings in IE options.

Post by Skunky »

could this be some amazon localization thing? because I get the same blocky output with lots of whitespace from 2 win10 boxes, a mac, and 3 linux boxes, using IE, safari, chrome, and firefox...

Post by jtclipper »

You can try to use the amg or discogs scripts and see if that works

Post by Skunky »

amg still works....

Post by Skunky »

Ok, so, it's been several years... with the recent backend refresh, I figured I might attempt to make something work, and maybe share a little.

the below is... hacky. not ready for prime time. there is a lot of code leftover from the original amazon.sco script here. it will get cover art, artist, album name, release year, and tracks (number, title, duration) only.
no comment, no publisher, no genre. it's not tested on multi-artist things at all. it does a bare minimum of data nabbing from amazon, but for me, it's enough.

to make it work... also can be a challenge:

1. go to the homepage; it takes you to the regular amazon site.
2. search for something, anything that SHOULD be in amazon music's streaming library.
   [attachment: 2. search from main amazon site.png]
3. get it to take you to the amazon music sub-site.
   [attachment: 3. search inside of amazon music.png]
4. THEN you can search for what you want, find the right album, and grab.
   [attachment: 5. Grab and be happy.png]

if you "sign in" to amazon, it will break.
if you change the URL at the top of the script to go directly to music.amazon.com, it will break in a different way.
I've not written a pascal program since 1991, so I'm kinda amazed I got this to work at all, honestly.

and, bottom line, none of this would have been possible at all without TGF in the first place... so, thanks!

Code:

{https://www.amazon.com}


{?
 TITLE=FORM;field-keywords=%:str
 ALBUM=FORM;field-keywords=%:str
 ARTIST=FORM;field-keywords=%:str
/?}


Program Amazon_streaming;
const LIST_SEP = ',';
var
  slMain: TStringList;
//  iPos: integer;
  iRow: integer;
  sTmp: String;

  //---
  procedure GetImage;
  var
    sLine: string;
  begin
     if on_FindRow( iRow, 0, '<music-detail-header image-src=https://', slMain) then begin
       sLine := sys_RegexReplaceAll ( slMain [ iRow ], ' image-src=' , ' image-src=>' );
       sLine := sys_RegexReplaceAll ( sLine, ' image-dimen=', '<image-dimen=' );
       sLine := on_cleanHTMLLine (sLine);
       on_setPicture ( sLine );
     end;
    iRow := 0;
  end;

  //---   used for genre and type strings, not populated by current amazon
//  function GetListing: string;
//  begin
//    result := Trim( on_cleanHTMLLine( AnsiReplaceText( slMain[ iRow ], '</a>', '</a>' + LIST_SEP ) ) );
//    while Pos( '  ', result ) > 0 do result :=  AnsiReplaceText( result, '  ', ' ' );
//    result := AnsiReplaceText( result, LIST_SEP + ' ', LIST_SEP );
//    if Copy( result, Length( result ), 1 ) = LIST_SEP then result := Copy( result, 1, Length( result ) - 1 );
//  end;

  //---
  procedure GetTracks;
  var
    sID, sTrack, sComposer, sPerformer, sTitle, sTime: string;
    iTrack: integer;
  begin

    iTrack := 0;

    while on_FindRow( iRow, 0, '<music-text-row data-key=', slMain ) do begin

       //same track ?
       if Pos( '<music-text-row data-key=' + IntToStr( iTrack ), slMain[ iRow ] ) = 0 then begin
          sID := Copy( slMain[ iRow ], Pos( 'data-key=', slMain[ iRow ] ) + 9, 999 );
          sID := Copy( sID, 1,  Pos( 'undefined', sID ) - 1 ) ;
          Inc( iTrack );
          sTrack := IntToStr( iTrack );
          sTitle := sID;

          //Track artist?
//          if on_FindRow( iRow, 0, '/ref=dm_ws_tlw_art' + sTrack + '"', slMain ) then begin
//             sPerformer := on_cleanHTMLLine( slMain[ iRow ] );
//          end;
          //Duration
          sTime := ''; // reset so a previous track's duration is not reused
          if on_FindRow( iRow, 0, sID + '</span>', slMain ) then begin
            if on_FindRow ( iRow, 0, '<span>', slMain ) then begin
              // earlier attempts: looping over every '<span>' row, or a fixed slMain[ iRow + 24 ] offset
              sTime := on_cleanHTMLLine( slMain[ iRow + 1 ] );
              // on_setComment ( slMain[ iRow + 1 ] ); // debug
            end;
          end;

          if on_getIsVarious and (sPerformer <> '') then  sTitle := sPerformer + ' / ' + sTitle;

          on_addTrack( sTrack, sTitle, sComposer, sTime );

       end;

       Inc(iRow);

    end;
  end;

  //---
//  procedure GetReview;
//  begin
//    on_setComment( on_cleanHTMLMultiLine( slMain[ iRow + 2 ] ) );
//  end;

begin

  on_Init(1); // comment this out if you wish to retain current values
  slMain := TStringList.Create;

  try

    iRow := 0;
    on_loadHTML( slMain, false, true ); // load the actual page to a stringlist
//    slMain.SaveToFile( 'c:\test.htm' ); //uncomment if you want to see the HTML code

    // image if any
    GetImage;

    // album
    if on_FindRow( iRow, 0, '<music-detail-header image-src=https://', slMain) then begin
       sTmp := sys_RegexReplaceAll ( slMain [ iRow ], ' headline=' , ' headline=>' );
       sTmp := sys_RegexReplaceAll ( sTmp, ' class=', '<class=' );
       on_setAlbum ( on_cleanHTMLLine (sTmp) ) ;
       iRow := 0;
    end;
    //artist
    if on_FindRow( iRow, 0, '<music-detail-header image-src=https://', slMain) then begin
       sTmp := sys_RegexReplaceAll ( slMain [ iRow ], ' primary-text=' , ' primary-text=>' );
       sTmp := sys_RegexReplaceAll ( sTmp, ' primary-text-href=', '<primary-text-href=' );
       sTmp := on_cleanHTMLLine( sTmp ); // clean once so the optional "The" check below sees plain text
       //if Copy( sTmp, 1, 4 ) = 'The ' then sTmp := Copy( sTmp, 5, 9999 ); // uncomment to get rid of THE <artist name>
       on_setArtist ( sTmp );
       if on_getIsVarious then on_setAlbumArtist ( sTmp );
       iRow := 0;
    end;
    //year
    if on_FindRow( iRow, 0, '<music-detail-header image-src=https://', slMain) then begin
       sTmp := sys_RegexReplaceAll ( slMain [ iRow ], ' tertiary-text=' , ' tertiary-text=>' );
       sTmp := sys_RegexReplaceAll ( sTmp, ' image-kind=', '<image-kind=' );
       sTmp := on_cleanHTMLLine (sTmp);
       // strip the first 12 whitespace-terminated chunks; what is left is used as the year
       sTmp := sys_RegexReplace (sTmp, '(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)', '' );
//       on_setComment ( sTmp );
       on_setYear( sTmp );
       iRow := 0;
    end;
    //label
    if on_FindRow( iRow, 0, '<strong>Label:</strong>', slMain) then begin
       sTmp := on_cleanHTMLLine( slMain[ iRow ] );
       on_setLabel( Copy( sTmp, 8, 9999 ) );
       iRow := 0;
    end;
// amazon doesn't seem to have genre info?
    //genre
//    if on_FindRow( iRow, 0, '<strong>Genres:</strong>', slMain) then begin
//       if on_FindRow( iRow, 0, '<li><a href="', slMain) then begin
//          sTmp := GetListing;
//          iPos := Pos( ',', sTmp );
//          if iPos > 0 then begin
//             on_setGenre( Copy( sTmp, 1, iPos -1  ) );
//             on_setStyles( sTmp );
//          end else begin
//             on_setGenre( sTmp );
//          end;
//       end;
//       iRow := 0;
//    end;

    //type
//    if on_FindRow( iRow, 0, '<li><b>Format:</b>', slMain) then begin
//       sTmp := Trim( Copy( GetListing, 8, 999 ) );
//       on_setType( sTmp );
//       iRow:=0;
//    end;

    // Tracks
    if on_FindRow( iRow, 0, '<music-text-row data-key=', slMain ) then begin
//       on_setComment ( slMain [ iRow]  ) ;
       GetTracks;
       iRow:=0;
    end;

    //Review
    //if on_FindRow( iRow, 0, '?????"', slMain ) then GetReview;

  finally
    slMain.Free;
    sys_SetStatusText( 0, 'Done' );
  end;

end.
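
For anyone reading the script above: the album, artist, year and cover lookups all use the same trick. A `>` is appended after the wanted attribute name and a `<` is put in front of the following attribute, so that tag stripping leaves only the value. Below is a minimal sketch of that idea in Python (not TGF script code). `strip_tags` is a hypothetical stand-in for `on_cleanHTMLLine`, whose exact behaviour is assumed here, and the sample row is made up for illustration; the real page markup and attribute quoting may differ.

```python
import re


def strip_tags(line):
    """Hypothetical stand-in for on_cleanHTMLLine: drop every <...> run and trim."""
    return re.sub(r'<[^>]*>', '', line).strip()


def isolate_attribute(line, attr, next_attr):
    """Keep only the text between ' attr=' and ' next_attr='.

    A '>' is appended after ' attr=' so everything before the value looks like
    one tag, and the space before 'next_attr=' becomes '<' so everything after
    the value looks like another tag. Stripping tags then leaves the value.
    """
    line = line.replace(' ' + attr + '=', ' ' + attr + '=>')
    line = line.replace(' ' + next_attr + '=', '<' + next_attr + '=')
    return strip_tags(line)


# Made-up row for illustration only.
row = (
    '<music-detail-header image-src=https://example.invalid/cover.jpg '
    'image-dimen=1:1 headline=Out of Sane class=hydrated '
    'primary-text=Clint Black primary-text-href=/artists/example>'
)

print(isolate_attribute(row, 'image-src', 'image-dimen'))
print(isolate_attribute(row, 'headline', 'class'))
print(isolate_attribute(row, 'primary-text', 'primary-text-href'))
```

The approach depends on the attribute order staying fixed. If the attribute that follows the wanted one changes, the second replacement no longer cuts at the right place.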

Post by Skunky »

ok, here's a slight update, going to the amazon music page reliably rather than the amazon homepage, and maybe getting various artist things correct.

there are various places where I've made commentary for myself, and a couple of debugging things... but, again, it mostly works, for me.

Code:

{https://www.amazon.com/music/player/albums}
//{https://www.amazon.com}


{?
 TITLE=FORM;field-keywords=%:str
 ALBUM=FORM;field-keywords=%:str
 ARTIST=FORM;field-keywords=%:str
/?}


Program Amazon_streaming;
const LIST_SEP = ',';
var
  slMain: TStringList;
//  iPos: integer;
  iRow: integer;
  sTmp: String;

  //---
  procedure GetImage;
  var
    sLine: string;
  begin
     if on_FindRow( iRow, 0, '<music-detail-header image-src=https://', slMain) then begin
       sLine := sys_RegexReplaceAll ( slMain [ iRow ], ' image-src=' , ' image-src=>' );
       sLine := sys_RegexReplaceAll ( sLine, ' image-dimen=', '<image-dimen=' );
       sLine := on_cleanHTMLLine (sLine);
       on_setPicture ( sLine );
     end;
    iRow := 0;
  end;

  //---   used for genre and type strings, not populated by current amazon
//  function GetListing: string;
//  begin
//    result := Trim( on_cleanHTMLLine( AnsiReplaceText( slMain[ iRow ], '</a>', '</a>' + LIST_SEP ) ) );
//    while Pos( '  ', result ) > 0 do result :=  AnsiReplaceText( result, '  ', ' ' );
//    result := AnsiReplaceText( result, LIST_SEP + ' ', LIST_SEP );
//    if Copy( result, Length( result ), 1 ) = LIST_SEP then result := Copy( result, 1, Length( result ) - 1 );
//  end;

  //---
  procedure GetTracks;
  var
//    mcomment: string;
    sID, sTrack, sComposer, sPerformer, sTitle, sTime: string;
    iTrack: integer;
  begin

    iTrack := 0;

    while on_FindRow( iRow, 0, '<music-text-row data-key=', slMain ) do begin

       //same track ?
       if Pos( '<music-text-row data-key=' + IntToStr( iTrack ), slMain[ iRow ] ) = 0 then begin
          sID := Copy( slMain[ iRow ], Pos( 'data-key=', slMain[ iRow ] ) + 9, 999 );
          sID := Copy( sID, 1,  Pos( 'undefined', sID ) - 1 ) ;
          Inc( iTrack );
          sTrack := IntToStr( iTrack );
          sTitle := sID;

          //Track artist?
          sPerformer := ''; // reset so a previous track's artist is not reused
          if on_FindRow( iRow, 0, '<div class=col3>', slMain ) then begin
            if sys_RegexFind ( slMain[ iRow + 1 ], '<music-link title=' ) then begin
//              mcomment := mcomment + slmain [irow +4];
//              on_setcomment (mcomment);
              sPerformer := on_cleanHTMLLine( slMain[ iRow + 4 ] );
            end;
          end;

          //Duration
          sTime := ''; // reset so a previous track's duration is not reused
          if on_FindRow( iRow, 0, '<div class=col4>', slMain ) then begin
            if on_FindRow ( iRow, 0, '<span>', slMain ) then
              sTime := on_cleanHTMLLine( slMain[ iRow + 1 ] );
          end;

          if on_getIsVarious and (sPerformer <> '') then  sTitle := sPerformer + ' / ' + sTitle;

          on_addTrack( sTrack, sTitle, sComposer, sTime );

       end;

       Inc(iRow);

    end;
  end;

  //---
//  procedure GetReview;
//  begin
//    on_setComment( on_cleanHTMLMultiLine( slMain[ iRow + 2 ] ) );
//  end;

begin

  on_Init(1); // comment this out if you wish to retain current values
  slMain := TStringList.Create;

  try

    iRow := 0;
    on_loadHTML( slMain, false, true ); // load the actual page to a stringlist
//    slMain.SaveToFile( 'c:\test.htm' ); //uncomment if you want to see the HTML code

    // image if any
    GetImage;

    // album
    if on_FindRow( iRow, 0, '<music-detail-header image-src=https://', slMain) then begin
       sTmp := sys_RegexReplaceAll ( slMain [ iRow ], ' headline=' , ' headline=>' );
       sTmp := sys_RegexReplaceAll ( sTmp, ' class=', '<class=' );
       on_setAlbum ( on_cleanHTMLLine (sTmp) ) ;
       iRow := 0;
    end;
    //artist
    if on_FindRow( iRow, 0, '<music-detail-header image-src=https://', slMain) then begin
       sTmp := sys_RegexReplaceAll ( slMain [ iRow ], ' primary-text=' , ' primary-text=>' );
       sTmp := sys_RegexReplaceAll ( sTmp, ' primary-text-href=', '<primary-text-href=' );
       sTmp := on_cleanHTMLLine( sTmp ); // clean once so the optional "The" check below sees plain text
       //if Copy( sTmp, 1, 4 ) = 'The ' then sTmp := Copy( sTmp, 5, 9999 ); // uncomment to get rid of THE <artist name>
       on_setArtist ( sTmp );
       if on_getIsVarious then on_setAlbumArtist ( sTmp );
       iRow := 0;
    end;
    //year
    if on_FindRow( iRow, 0, '<music-detail-header image-src=https://', slMain) then begin
       sTmp := sys_RegexReplaceAll ( slMain [ iRow ], ' tertiary-text=' , ' tertiary-text=>' );
       sTmp := sys_RegexReplaceAll ( sTmp, ' image-kind=', '<image-kind=' );
       sTmp := on_cleanHTMLLine (sTmp);
       // strip the first 12 whitespace-terminated chunks; what is left is used as the year
       sTmp := sys_RegexReplace (sTmp, '(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)(.*?\s)', '' );
//       on_setComment ( sTmp );
       on_setYear( sTmp );
       iRow := 0;
    end;
    //label
    if on_FindRow( iRow, 0, '<strong>Label:</strong>', slMain) then begin
       sTmp := on_cleanHTMLLine( slMain[ iRow ] );
       on_setLabel( Copy( sTmp, 8, 9999 ) );
       iRow := 0;
    end;
// amazon doesn't seem to have genre info?
    //genre
//    if on_FindRow( iRow, 0, '<strong>Genres:</strong>', slMain) then begin
//       if on_FindRow( iRow, 0, '<li><a href="', slMain) then begin
//          sTmp := GetListing;
//          iPos := Pos( ',', sTmp );
//          if iPos > 0 then begin
//             on_setGenre( Copy( sTmp, 1, iPos -1  ) );
//             on_setStyles( sTmp );
//          end else begin
//             on_setGenre( sTmp );
//          end;
//       end;
//       iRow := 0;
//    end;

    //type
//    if on_FindRow( iRow, 0, '<li><b>Format:</b>', slMain) then begin
//       sTmp := Trim( Copy( GetListing, 8, 999 ) );
//       on_setType( sTmp );
//       iRow:=0;
//    end;

    // Tracks
    if on_FindRow( iRow, 0, '<music-text-row data-key=', slMain ) then begin
//       on_setComment ( slMain [ iRow]  ) ;
       GetTracks;
       iRow:=0;
    end;

    //Review
    //if on_FindRow( iRow, 0, '?????"', slMain ) then GetReview;

  finally
    slMain.Free;
    sys_SetStatusText( 0, 'Done' );
  end;

end.
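
A side note on the year step in the script above: it removes a fixed number of leading chunks, so it depends on the header text always having the same number of words. One possible alternative, sketched here in Python (not TGF script code) and not what the script does, is to look for a four-digit year directly. `find_year` is a hypothetical helper, and the sample text is made up, since the real `tertiary-text` value is not shown in this thread.

```python
import re

# A standalone four-digit number starting with 19 or 20.
YEAR = re.compile(r'\b(?:19|20)\d{2}\b')


def find_year(text):
    """Return the last year-like number in text, or '' when there is none."""
    matches = YEAR.findall(text)
    return matches[-1] if matches else ''


# Made-up header text for illustration only.
sample = '12 SONGS, 40 MINUTES, JUN 19 2020'

print(find_year(sample))
print(repr(find_year('no date here')))
```

Taking the last match is itself an assumption that the release date comes at the end of the text. It keeps a day number such as 19 from being mistaken for the year, and the word boundaries stop the pattern from matching inside a longer number.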

Post by jtclipper »

Thank you for the update.
Once you finish your modifications, let me know and I will include this script in the main install package.
