AlphaBlend API replacement, overlay bitmaps with alpha blending - SSE2 instructions support
|
huntingspace |
|
Active Member
Group: Members
Posts: 49
Member No.: 49504
Joined: 11-February 12
|
This procedure blends two 32-bit bitmaps that carry an alpha channel. It is assembled with FPU instructions by default, or with SSE2 instructions when _MMX_SSE2 is defined. The bitmaps do not need to be premultiplied. On return, eax is TRUE on success and FALSE on failure.

Parameters:
hBmpSrc - A handle to the foreground bitmap.
dwSrcX - The x-coordinate, in pixels, of the upper-left corner of the foreground bitmap rectangle.
dwSrcY - The y-coordinate, in pixels, of the upper-left corner of the foreground bitmap rectangle.
dwSrcWidth - The width, in pixels, of the foreground bitmap rectangle.
dwSrcHeight - The height, in pixels, of the foreground bitmap rectangle.
hBmpDst - A handle to the background bitmap.
dwDstX - The x-coordinate, in pixels, of the upper-left corner of the background bitmap rectangle.
dwDstY - The y-coordinate, in pixels, of the upper-left corner of the background bitmap rectangle.
dwSrcAlpha - The alpha value applied to the whole foreground bitmap (0...255).
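To make the rectangle handling concrete, here is a rough C sketch of how the procedure appears to clip the requested rectangle before it blends anything. The BlendRect struct and the function names are my own, made up for illustration; they do not appear in the assembly. Treat it as one reading of the listing, not a replacement for it.

```c
#include <stdbool.h>

/* Hypothetical container for the rectangle parameters (not in the original code). */
typedef struct {
    int src_x, src_y;   /* dwSrcX, dwSrcY */
    int dst_x, dst_y;   /* dwDstX, dwDstY */
    int width, height;  /* dwSrcWidth, dwSrcHeight */
} BlendRect;

/* If one coordinate is negative, move it to 0 and shift the other
   rectangle's coordinate by the same amount. */
static void shift_if_negative(int *coord, int *other)
{
    if (*coord < 0) {
        *other += -*coord;
        *coord = 0;
    }
}

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Returns false where the assembly would jump to failexit. */
static bool clip_blend_rect(BlendRect *r, int src_w, int src_h, int dst_w, int dst_h)
{
    if (r->width <= 0 || r->height <= 0)
        return false;

    shift_if_negative(&r->src_x, &r->dst_x);
    shift_if_negative(&r->src_y, &r->dst_y);
    shift_if_negative(&r->dst_x, &r->src_x);
    shift_if_negative(&r->dst_y, &r->src_y);

    /* Nothing left to blend if the origin lies outside either bitmap. */
    if (src_w - r->src_x <= 0 || src_h - r->src_y <= 0)
        return false;
    if (dst_w - r->dst_x <= 0 || dst_h - r->dst_y <= 0)
        return false;

    /* Clamp the size to what remains in both bitmaps. */
    r->width  = min_int(r->width,  min_int(src_w - r->src_x, dst_w - r->dst_x));
    r->height = min_int(r->height, min_int(src_h - r->src_y, dst_h - r->dst_y));
    return true;
}
```

Note that a negative coordinate shifts the other rectangle but does not shrink the width or height. With src_x = -5 and width = 20, the sketch (like the listing, as far as I can tell) still blends 20 columns, starting at dst_x + 5.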
CODE |
BitmapAlphaBlending32 proc uses esi edi ebx hBmpSrc,dwSrcX,dwSrcY,dwSrcWidth,dwSrcHeight,hBmpDst,dwDstX,dwDstY,dwSrcAlpha:DWORD
    LOCAL BmpSrc:BITMAP
    LOCAL BmpDst:BITMAP
    LOCAL dwSrcMemSize,dwDstMemSize:DWORD
    LOCAL lpSrcMem,lpDstMem:DWORD
IFNDEF _MMX_SSE2
    LOCAL dwPixelVal,dwDiv255:DWORD
    mov dwDiv255,255
ELSE
    LOCAL dwRemainder:DWORD
ENDIF
    cmp hBmpSrc,0
    je failexit
    cmp hBmpDst,0
    je failexit
    cmp dwSrcWidth,0
    jle failexit
    cmp dwSrcHeight,0
    jle failexit
    mov eax,dwSrcAlpha
    test eax,eax
    jl failexit
    cmp eax,255
    jg failexit
    mov lpSrcMem,0
    mov lpDstMem,0
    invoke GetObject,hBmpSrc,sizeof BITMAP,addr BmpSrc
    invoke GetObject,hBmpDst,sizeof BITMAP,addr BmpDst
    cmp BmpSrc.bmBitsPixel,32  ; 32-bit foreground bitmap only
    jnz failexit
    cmp BmpDst.bmBitsPixel,32  ; 32-bit background bitmap only
    jnz failexit
    mov eax,dwSrcX
    test eax,eax
    jge @F
    neg eax                    ; Invert if negative value
    add dwDstX,eax
    add dwSrcX,eax
@@: mov eax,dwSrcY
    test eax,eax
    jge @F
    neg eax                    ; Invert if negative value
    add dwDstY,eax
    add dwSrcY,eax
@@: mov eax,dwDstX
    test eax,eax
    jge @F
    neg eax                    ; Invert if negative value
    add dwSrcX,eax
    add dwDstX,eax
@@: mov eax,dwDstY
    test eax,eax
    jge @F
    neg eax                    ; Invert if negative value
    add dwSrcY,eax
    add dwDstY,eax
@@: mov eax,BmpSrc.bmWidth
    sub eax,dwSrcX
    jle failexit
    cmp dwSrcWidth,eax
    jle @F
    mov dwSrcWidth,eax
@@: mov eax,BmpSrc.bmHeight
    sub eax,dwSrcY
    jle failexit
    cmp dwSrcHeight,eax
    jle @F
    mov dwSrcHeight,eax
@@: mov eax,BmpDst.bmWidth
    sub eax,dwDstX
    jle failexit
    cmp dwSrcWidth,eax
    jle @F
    mov dwSrcWidth,eax
@@: mov eax,BmpDst.bmHeight
    sub eax,dwDstY
    jle failexit
    cmp dwSrcHeight,eax
    jle @F
    mov dwSrcHeight,eax
@@: mov ebx,BmpDst.bmWidth
    imul ebx,dwDstY
    add ebx,dwDstX
    mov edi,BmpDst.bmBits
    test edi,edi
    jnz @F
    ; Get DDB background bitmap bits
    mov eax,BmpDst.bmWidth
    mul BmpDst.bmHeight
    shl eax,4
    mov dwDstMemSize,eax
    invoke LocalAlloc,LMEM_FIXED,eax
    test eax,eax
    je failexit
    mov lpDstMem,eax
    invoke GetBitmapBits,hBmpDst,dwDstMemSize,lpDstMem
    test eax,eax
    je failexit
    mov edi,lpDstMem
@@: lea edi,[edi+ebx*4]
    mov ebx,BmpSrc.bmWidth
    imul ebx,dwSrcY
    add ebx,dwSrcX
    mov esi,BmpSrc.bmBits
    test esi,esi
    jnz @F
    ; Get DDB foreground bitmap bits
    mov eax,BmpSrc.bmWidth
    mul BmpSrc.bmHeight
    shl eax,4
    mov dwSrcMemSize,eax
    invoke LocalAlloc,LMEM_FIXED,eax
    test eax,eax
    je failexit
    mov lpSrcMem,eax
    invoke GetBitmapBits,hBmpSrc,dwSrcMemSize,lpSrcMem
    test eax,eax
    je failexit
    mov esi,lpSrcMem
@@: lea esi,[esi+ebx*4]
    shl BmpSrc.bmWidth,2
    shl BmpDst.bmWidth,2
IFDEF _MMX_SSE2
    mov eax,dwSrcWidth
    shr dwSrcWidth,2
    and eax,3
    mov dwRemainder,eax
; ---------------------------------------
    ; Uncomment to test for SSE2 availability if needed
;    xor eax,eax
;    inc eax
;    cpuid
;    test edx,4000000h
;    je failexit
; ---------------------------------------
    pcmpeqd   xmm6,xmm6        ; All ones (cmpeqpd gives 0 for lanes holding NaN)
    psrlw     xmm6,8
    movd      xmm5,dwSrcAlpha
    pshuflw   xmm5,xmm5,0
    pshufd    xmm5,xmm5,0
    pmullw    xmm6,xmm5
    psrlw     xmm6,8
ENDIF
; ---------------------------------------
; A = source alpha: 0...255
; C = compositing color (c, cr, cg, cb)
; F = foreground color  (f, fr, fg, fb)
; B = background color  (b, br, bg, bb)
; a = (f/255)*(A/255)
; Composite: C = (1-a)*B + a*F
; ---------------------------------------
    Foreground    equ <esi+edx>
    Background    equ <edi+edx>
    Composite     equ <Background>
; ---------------------------------------
outloop:
    mov ecx,dwSrcWidth
IFDEF _MMX_SSE2
    mov eax,dwRemainder
    test eax,eax
    je inloop2
    mov ebx,ecx
    shl ebx,4
    dec eax
    pxor xmm1,xmm1
    pxor xmm3,xmm3
inloop1:
    lea edx,[ebx+eax*4]
    movd      xmm0,dword ptr [Foreground]
    movd      xmm2,dword ptr [Background]
    pshuflw   xmm4,xmm0,01010101b
    psrlw     xmm4,8
    pmullw    xmm4,xmm6
    pxor      xmm7,xmm7
    punpcklbw xmm0,xmm7
    punpcklbw xmm2,xmm7
    pmulhuw   xmm0,xmm4
    packuswb  xmm0,xmm1
    cmpeqpd   xmm7,xmm7
    pxor      xmm4,xmm7
    pmulhuw   xmm2,xmm4
    packuswb  xmm2,xmm3
    paddusb   xmm0,xmm2
    movd      dword ptr [Composite],xmm0
    dec eax
    jge inloop1
    test ecx,ecx
    je @F
inloop2:
    lea edx,[ecx-1]
    shl edx,4
    movdqu    xmm0,[Foreground]
    movdqu    xmm2,[Background]
    pshuflw   xmm4,xmm0,01010101b
    pshuflw   xmm3,xmm0,11111111b
    movlhps   xmm4,xmm3
    pshufhw   xmm3,xmm0,01010101b
    pshufhw   xmm5,xmm0,11111111b
    movhlps   xmm5,xmm3
    psrlw     xmm4,8
    psrlw     xmm5,8
    pmullw    xmm4,xmm6
    pmullw    xmm5,xmm6
    movdqu    xmm1,xmm0
    movdqu    xmm3,xmm2
    pxor      xmm7,xmm7
    punpcklbw xmm0,xmm7
    punpckhbw xmm1,xmm7
    punpcklbw xmm2,xmm7
    punpckhbw xmm3,xmm7
    pmulhuw   xmm0,xmm4
    pmulhuw   xmm1,xmm5
    packuswb  xmm0,xmm1
    cmpeqpd   xmm7,xmm7
    pxor      xmm4,xmm7
    pxor      xmm5,xmm7
    pmulhuw   xmm2,xmm4
    pmulhuw   xmm3,xmm5
    packuswb  xmm2,xmm3
    paddusb   xmm0,xmm2
    movdqu    [Composite],xmm0
ELSE
inloop2:
    lea ebx,[ecx*4-5]
    lea edx,[ebx+4]
    movzx eax,byte ptr [Foreground]
    mov dwPixelVal,eax
    fild dwPixelVal
    fidiv dwDiv255
    fimul dwSrcAlpha
    fidiv dwDiv255
    fld1
    fsub st(0),st(1)
smloop:
    movzx eax,byte ptr [Foreground]
    mov dwPixelVal,eax
    fild dwPixelVal
    fmul st(0),st(2)
    movzx eax,byte ptr [Background]
    mov dwPixelVal,eax
    fild dwPixelVal
    fmul st(0),st(2)
    faddp st(1),st(0)
    fistp dwPixelVal
    mov eax,dwPixelVal
    mov byte ptr [Composite],al
    dec edx
    cmp edx,ebx
    jnz smloop
    fstp st(0)
    fstp st(0)
ENDIF
    dec ecx
    jg inloop2
@@: add esi,BmpSrc.bmWidth
    add edi,BmpDst.bmWidth
    dec dwSrcHeight
    jg outloop
    cmp lpDstMem,0
    je @F
    ; Set DDB background bitmap bits
    invoke SetBitmapBits,hBmpDst,dwDstMemSize,lpDstMem
    invoke LocalFree,lpDstMem
@@: cmp lpSrcMem,0
    je @F
    ; Set DDB foreground bitmap bits
    invoke SetBitmapBits,hBmpSrc,dwSrcMemSize,lpSrcMem
    invoke LocalFree,lpSrcMem
@@: xor eax,eax
    inc eax                    ; Success: eax = TRUE
    ret
failexit:
    xor eax,eax                ; Failure: eax = FALSE
    ret
BitmapAlphaBlending32 endp
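For checking results, a scalar C version of the formula in the comment block may help. This is only a sketch, not the author's code. blend_pixel32 follows the FPU path, which blends all four bytes of a pixel, including the alpha byte itself. blend_channel_fixed is my reading of the 16-bit arithmetic in the SSE2 path and has not been checked against real output. Both function names are hypothetical.

```c
#include <stdint.h>

/* FPU-style reference: a = (f/255)*(A/255), C = (1-a)*B + a*F.
   Pixels are 4 bytes with the per-pixel alpha f in byte 3.
   The result is written back to bg. */
static void blend_pixel32(const uint8_t fg[4], uint8_t bg[4], int src_alpha)
{
    double a = (fg[3] / 255.0) * (src_alpha / 255.0);
    for (int i = 0; i < 4; ++i) {
        double c = (1.0 - a) * bg[i] + a * fg[i];
        /* Round to nearest; ties round up here, while fistp in its
           default mode rounds ties to even. */
        bg[i] = (uint8_t)(c + 0.5);
    }
}

/* Assumed model of one channel in the SSE2 path (16-bit fixed point). */
static uint8_t blend_channel_fixed(uint8_t fg, uint8_t bg, uint8_t fg_alpha, int src_alpha)
{
    uint32_t scaled = (255u * (uint32_t)src_alpha) >> 8;  /* pmullw, then psrlw 8 */
    uint32_t w = (uint32_t)fg_alpha * scaled;             /* pmullw, fits in 16 bits */
    uint32_t f = ((uint32_t)fg * w) >> 16;                /* pmulhuw */
    uint32_t b = ((uint32_t)bg * (0xFFFFu - w)) >> 16;    /* pxor with all ones, pmulhuw */
    uint32_t sum = f + b;
    return (uint8_t)(sum > 255u ? 255u : sum);            /* paddusb saturates */
}
```

If that reading is right, the two build modes do not give identical results. At full opacity (f = 255, A = 255) the FPU path returns the foreground byte unchanged, while the fixed-point path comes out a few levels lower, for example 197 for a foreground byte of 200 over black.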
|
huntingspace |
|
|
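This second procedure blends two 24-bit bitmaps. As before, it can be assembled with either FPU or SSE2 instructions, and the parameters are the same. Because 24-bit pixels carry no alpha channel, only dwSrcAlpha controls the blend.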
CODE |
BitmapAlphaBlending24 proc uses esi edi ebx hBmpSrc,dwSrcX,dwSrcY,dwSrcWidth,dwSrcHeight,hBmpDst,dwDstX,dwDstY,dwSrcAlpha:DWORD
    LOCAL BmpSrc:BITMAP
    LOCAL BmpDst:BITMAP
IFNDEF _MMX_SSE2
    LOCAL dwPixelVal,dwDiv255:DWORD
    mov dwDiv255,255
ELSE
    LOCAL dwRemainder:DWORD
ENDIF
    cmp hBmpSrc,0
    je failexit
    cmp hBmpDst,0
    je failexit
    cmp dwSrcWidth,0
    jle failexit
    cmp dwSrcHeight,0
    jle failexit
    mov eax,dwSrcAlpha
    test eax,eax
    jl failexit
    cmp eax,255
    jg failexit
    invoke GetObject,hBmpSrc,sizeof BITMAP,addr BmpSrc
    invoke GetObject,hBmpDst,sizeof BITMAP,addr BmpDst
    cmp BmpSrc.bmBitsPixel,24  ; 24-bit foreground bitmap only
    jnz failexit
    cmp BmpDst.bmBitsPixel,24  ; 24-bit background bitmap only
    jnz failexit
    mov eax,dwSrcX
    test eax,eax
    jge @F
    neg eax                    ; Invert if negative value
    add dwDstX,eax
    add dwSrcX,eax
@@: mov eax,dwSrcY
    test eax,eax
    jge @F
    neg eax                    ; Invert if negative value
    add dwDstY,eax
    add dwSrcY,eax
@@: mov eax,dwDstX
    test eax,eax
    jge @F
    neg eax                    ; Invert if negative value
    add dwSrcX,eax
    add dwDstX,eax
@@: mov eax,dwDstY
    test eax,eax
    jge @F
    neg eax                    ; Invert if negative value
    add dwSrcY,eax
    add dwDstY,eax
@@: mov eax,BmpSrc.bmWidth
    sub eax,dwSrcX
    jle failexit
    cmp dwSrcWidth,eax
    jle @F
    mov dwSrcWidth,eax
@@: mov eax,BmpSrc.bmHeight
    sub eax,dwSrcY
    jle failexit
    cmp dwSrcHeight,eax
    jle @F
    mov dwSrcHeight,eax
@@: mov eax,BmpDst.bmWidth
    sub eax,dwDstX
    jle failexit
    cmp dwSrcWidth,eax
    jle @F
    mov dwSrcWidth,eax
@@: mov eax,BmpDst.bmHeight
    sub eax,dwDstY
    jle failexit
    cmp dwSrcHeight,eax
    jle @F
    mov dwSrcHeight,eax
@@: mov ebx,BmpDst.bmWidth
    lea ebx,[ebx*2+ebx+3]
    and ebx,-4
    imul ebx,dwDstY
    mov ecx,dwDstX
    lea ecx,[ecx*2+ecx]
    add ebx,ecx
    mov edi,BmpDst.bmBits
    test edi,edi
    je failexit
    lea edi,[edi+ebx]
    mov ebx,BmpSrc.bmWidth
    lea ebx,[ebx*2+ebx+3]
    and ebx,-4
    imul ebx,dwSrcY
    mov ecx,dwSrcX
    lea ecx,[ecx*2+ecx]
    add ebx,ecx
    mov esi,BmpSrc.bmBits
    test esi,esi
    je failexit
    lea esi,[esi+ebx]
    mov eax,BmpSrc.bmWidth
    lea eax,[eax*2+eax+3]
    and eax,-4
    mov BmpSrc.bmWidth,eax
    mov eax,BmpDst.bmWidth
    lea eax,[eax*2+eax+3]
    and eax,-4
    mov BmpDst.bmWidth,eax
IFDEF _MMX_SSE2
    mov eax,dwSrcWidth
    shr dwSrcWidth,2
    and eax,3
    mov dwRemainder,eax
    pcmpeqd   xmm6,xmm6        ; All ones (cmpeqpd gives 0 for lanes holding NaN)
    psrlw     xmm6,8
    movd      xmm5,dwSrcAlpha
    pshuflw   xmm5,xmm5,0
    pshufd    xmm5,xmm5,0
    pmullw    xmm6,xmm5
ENDIF
; ---------------------------------------
; A = source alpha: 0...255
; C = compositing color (cr, cg, cb)
; F = foreground color  (fr, fg, fb)
; B = background color  (br, bg, bb)
; a = A / 255
; Composite: C = (1-a)*B + a*F
; ---------------------------------------
    Foreground    equ <esi+edx>
    Background    equ <edi+edx>
    Composite     equ <Background>
; ---------------------------------------
outloop:
    mov ecx,dwSrcWidth
IFDEF _MMX_SSE2
    mov eax,dwRemainder
    test eax,eax
    je inloop2
    lea ebx,[ecx*2+ecx]
    add ebx,ebx
    add ebx,ebx
    dec eax
    pxor xmm1,xmm1
    pxor xmm3,xmm3
inloop1:
    lea edx,[eax*2+eax]
    add edx,ebx
    movd      xmm0,dword ptr [Foreground]
    movd      xmm2,dword ptr [Background]
    movdqu    xmm5,xmm2
    pxor      xmm7,xmm7
    punpcklbw xmm0,xmm7
    punpcklbw xmm2,xmm7
    movdqu    xmm4,xmm6
    pmulhuw   xmm0,xmm4
    packuswb  xmm0,xmm1
    cmpeqpd   xmm7,xmm7
    pxor      xmm4,xmm7
    pmulhuw   xmm2,xmm4
    packuswb  xmm2,xmm3
    paddusb   xmm0,xmm2
    movdqu    xmm4,xmm7
    psrldq    xmm7,13
    pslldq    xmm4,3
    pand      xmm0,xmm7
    pand      xmm5,xmm4
    por       xmm0,xmm5
    movd      dword ptr [Composite],xmm0
    dec eax
    jge inloop1
    test ecx,ecx
    je @F
inloop2:
    lea edx,[ecx-1]
    lea edx,[edx*2+edx]
    add edx,edx
    add edx,edx
    movdqu    xmm0,[Foreground]
    movdqu    xmm2,[Background]
    movdqu    xmm5,xmm2
    movdqu    xmm1,xmm0
    movdqu    xmm3,xmm2
    pxor      xmm7,xmm7
    punpcklbw xmm0,xmm7
    punpckhbw xmm1,xmm7
    punpcklbw xmm2,xmm7
    punpckhbw xmm3,xmm7
    movdqu    xmm4,xmm6
    pmulhuw   xmm0,xmm4
    pmulhuw   xmm1,xmm4
    packuswb  xmm0,xmm1
    cmpeqpd   xmm7,xmm7
    pxor      xmm4,xmm7
    pmulhuw   xmm2,xmm4
    pmulhuw   xmm3,xmm4
    packuswb  xmm2,xmm3
    paddusb   xmm0,xmm2
    movdqu    xmm4,xmm7
    psrldq    xmm7,4
    pslldq    xmm4,12
    pand      xmm0,xmm7
    pand      xmm5,xmm4
    por       xmm0,xmm5
    movdqu    [Composite],xmm0
ELSE
inloop2:
    lea ebx,[ecx*2+ecx-4]
    lea edx,[ebx+3]
    fild dwSrcAlpha
    fidiv dwDiv255
    fld1
    fsub st(0),st(1)
smloop:
    movzx eax,byte ptr [Foreground]
    mov dwPixelVal,eax
    fild dwPixelVal
    fmul st(0),st(2)
    movzx eax,byte ptr [Background]
    mov dwPixelVal,eax
    fild dwPixelVal
    fmul st(0),st(2)
    faddp st(1),st(0)
    fistp dwPixelVal
    mov eax,dwPixelVal
    mov byte ptr [Composite],al
    dec edx
    cmp edx,ebx
    jnz smloop
    fstp st(0)
    fstp st(0)
ENDIF
    dec ecx
    jg inloop2
@@: add esi,BmpSrc.bmWidth
    add edi,BmpDst.bmWidth
    dec dwSrcHeight
    jg outloop
    xor eax,eax
    inc eax                    ; Success: eax = TRUE
    ret
failexit:
    xor eax,eax                ; Failure: eax = FALSE
    ret
BitmapAlphaBlending24 endp
|
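For the 24-bit case, the two pieces that differ from the 32-bit version are the row stride and the constant alpha. A small C sketch of both follows. The function names dib24_stride and blend_row24 are made up for illustration, and the sketch assumes DWORD-aligned rows, which is what the lea/and pair in the listing computes. It is meant as a cross-check for the FPU path, not as the author's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes per row of a 24-bit bitmap with rows padded to a multiple of 4,
   i.e. (width*3 + 3) with the low two bits cleared. */
static size_t dib24_stride(size_t width)
{
    return (width * 3 + 3) & ~(size_t)3;
}

/* Blend `count` pixels of one row with a constant alpha:
   a = A/255, C = (1-a)*B + a*F, written back to bg. */
static void blend_row24(const uint8_t *fg, uint8_t *bg, size_t count, int src_alpha)
{
    double a = src_alpha / 255.0;
    for (size_t i = 0; i < count * 3; ++i) {
        double c = (1.0 - a) * bg[i] + a * fg[i];
        bg[i] = (uint8_t)(c + 0.5);   /* round to nearest, ties up */
    }
}
```

To walk a whole rectangle, advance the foreground and background row pointers by their own dib24_stride values after each row, as the listing does with the padded bmWidth values.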
P.S. Let me know if you find any bugs in the algorithm.
Attached File ( Number of downloads: 38 )